Abstract

We consider the problem of balancing load items (tokens) on networks. Starting with
an arbitrary load distribution, we allow nodes in each round to exchange tokens with their
neighbors. The goal is to achieve a distribution where all nodes have nearly the same number
of tokens.

For the continuous case where tokens are arbitrarily divisible, most load balancing 
schemes correspond to Markov chains whose convergence is fairly well-understood in terms 
of their spectral gap. However, in many applications load items cannot be divided arbi- 
trarily and we need to deal with the discrete case where the load is composed of indivisible 
tokens. This discretization entails a non-linear behavior due to its rounding errors, which 
makes the analysis much harder than in the continuous case. Therefore, it has been a major 
open problem to understand the limitations of discrete load balancing and its relation to 
the continuous case. 

We investigate several randomized protocols for different communication models in the 
discrete case. Our results demonstrate that there is almost no difference between the discrete 
and continuous case. For instance, for any regular network in the matching model, all nodes 
have the same load up to an additive constant in (asymptotically) the same number of 
rounds required in the continuous case. This generalizes and tightens the previous best 
result, which only holds for expander graphs (STOC'09). 

Keywords: randomized algorithms, parallel and distributed algorithms, graph expan- 
sion, Markov chains, load balancing. 
1 Introduction



Consider an application running on a parallel network with n processors. Every processor has 
initially a certain amount of tokens (tasks) and the processors are connected by an arbitrary 
graph. The goal of load balancing is to reallocate the tokens by transferring them along the 
edges so that eventually every processor has almost the same number of tokens. 

Load balancing is a well-studied problem in distributed systems and has manifold applications
in scheduling [39], hashing [28], routing [12], numerical computation such as solving partial
differential equations [?, 40, 42] and simulating dynamics [10]. This trend has been reinforced
by the flattening of processor speeds leading to an increasing usage of multi-core processors
[?, 25] and the emergence of large decentralized networks like P2P networks [?, 21, ?].
Especially for large-scale networks, it is desirable to use local and iterative load balancing
protocols, where every processor only needs to know its own current load and the loads of the
neighboring processors, and based on this decides how many tokens should be sent (or received).

A widely used approach is the so-called diffusion (i.e., the first-order-diffusion scheme
[?, 32]), where the amount of load sent along each edge in each round is proportional to the load
difference between the incident nodes. The alternative is the matching model, where in each
round a matching is fixed and only the edges of this matching can be used for averaging the load.

We measure the smoothness of the load distribution by the so-called discrepancy which is the 
difference between the maximum and minimum load among all nodes. In view of more complex 
scenarios where jobs are eventually removed or new jobs are generated, the discrepancy seems 
to be a more appropriate measure than the makespan, which only considers the maximum load. 

Many studies on load balancing assume that the load is arbitrarily divisible. In this continuous
case, the diffusion scheme corresponds to a Markov chain on the graph and one can resort
to a battery of established techniques to analyze the convergence speed [3, 19, 32]. In particular,
the spectral gap captures the time to reach a small discrepancy quite accurately [35, 37].
This relation continues to hold for the matching model, even if the matchings are generated
randomly, which might be necessary for graphs with no canonical matchings [11, 31].

However, in many applications a processor's load may consist of tasks which are not further
divisible, which is why the continuous case is also referred to as "idealized case" [?]. A common
way to model indivisible tasks is the unit-size token model where one assumes a smallest load
entity, the unit-size token, and load is always represented by a multiple of this smallest entity.
In the following, we will refer to the unit-size token model as the discrete case. Because of the
close relation between continuous load balancing and Markov chains, many authors
[?, 26, 31, 32, 35, 38] asked for a characterization of the convergence speed of discrete load
balancing, or alternatively, a quantification of the deviation between the discrete and the
continuous case. Unfortunately, the discrete case is much harder to analyze due to its
nonlinearity caused by the roundings to whole tokens in each round.

Muthukrishnan et al. [32] proved the first rigorous result for the discrete case in the diffusion
model. They assume that the load amount sent along each edge is obtained by rounding down
the load amount that would be sent in the continuous case. Using this approach, they showed
that the discrepancy is at most $O\left(\frac{dn}{1-\lambda}\right)$ after $O\left(\frac{\log(Kn)}{1-\lambda}\right)$ rounds, where d is the degree, K is
the discrepancy of the initial load vector and $1 - \lambda$ is the spectral gap of the diffusion matrix.
Similar results for the matching model were shown by Muthukrishnan and Ghosh [31].

Further progress was made by Rabani et al. [35] who introduced the so-called local divergence,
which is a natural parameter that essentially aggregates the sum of load differences over all edges
in all rounds. For both the diffusion and matching model, they proved that the local divergence
yields an upper bound on the maximum deviation between the continuous and discrete case
for the aforementioned rounding down approach. They also computed the local divergence for
different networks such as torus graphs and proved a general upper bound which translates into
a discrepancy bound of $O\left(\frac{d \log n}{1-\lambda}\right)$ after $O\left(\frac{\log(Kn)}{1-\lambda}\right)$ rounds for any d-regular graph.
While always rounding down may lead to a quick stabilization, the discrepancy could be
quite large, i.e., as large as the diameter of the graph (in case of diffusion, it could be even
the diameter times the degree). Therefore, Rabani et al. also suggested using randomized
rounding in order to get a better approximation of the continuous case. Herlihy and Tirthapura
[20] analyzed such a protocol for the hypercube in the matching model and proved a discrepancy
bound of $O(\sqrt{\log n})$ after $\log_2 n$ rounds. Friedrich and Sauerwald [17] presented the first general
analysis of this randomized protocol in the matching model. By analyzing the $\ell_2$-version of
the local divergence, the so-called local 2-divergence, they proved that on many networks, the
randomized protocol yields a square root improvement in terms of the achieved discrepancy
compared to the deterministic protocol from [35].



Recently, Berenbrink et al. [8] extended some of the results from [17] to the diffusion model.
One general challenge in the diffusion model is that nodes may receive too many (or too few)
tokens in a single round, since all neighbors have to make their decisions locally and independently
of each other. This might explain why most discrepancy bounds for diffusion depend on the
degree of the network and are thus weaker than the corresponding bounds for the matching
model. Additionally, practical simulations seem to favor the matching model, especially
for fine load balancing [41].



Closely related to the problem addressed here are balancing networks [?], which are siblings
of sorting networks with comparators replaced by balancers. Klugerman and Plaxton [23] gave
the first construction of a balancing network of depth $O(\log n)$ which achieves a discrepancy of
one. Their network relies on the famous AKS sorting network [?]. Rabani et al. [35] derived
results for other networks, but these involve a much larger depth. All of these results [5, 23, 35]
require each balancer to be initialized in a special way, while our randomized protocols do not
require any specific initialization and therefore seem to be more practical.

There are also studies in which the nodes are equipped with additional abilities compared
to our model. For instance, Even-Dar and Mansour [16] analyzed a load balancing model where
every node knows the average load. Elsässer and Sauerwald [15] analyzed an algorithm which
uses random-walk based routing of positive and negative tokens to minimize the makespan.

While all aforementioned load balancing protocols can send an arbitrary number of tokens
along the used edges, several studies consider an alternative model in which only a single token
can traverse each edge in each round. Obviously, the convergence is much slower and at
least linear in the initial discrepancy (for concrete results, see Aiello et al. [?], Ghosh et al. [19]).

Finally, many (re-)allocation schemes have been analyzed which are based on the famous
power-of-two choices paradigm for balls-and-bins models (cf. Mitzenmacher [30]). While there
are results in distributed settings (e.g., Adler et al. [1], Lenzen and Wattenhofer [24]), most of
them assume a complete graph as the underlying network. One notable exception is the work
of Kenthapadi and Panigrahy [22]. However, they only considered the sequential allocation of n
tokens and required the degree to be at least polynomial in n in order to achieve a discrepancy
of $O(\log \log n)$.

Our Results. We analyze several natural randomized protocols for indivisible tokens. All 
protocols have in common that randomized rounding is used to "imitate" the behavior of the 
continuous case in each round. Our main result for the matching model is as follows: 

Theorem 1.1. Let G be a regular graph with n nodes and K be the discrepancy of the initial load
vector. There is a constant $c > 0$ independent of G and K, so that w.p. $1 - \exp(-(\log n)^{\Omega(1)})$,
the discrepancy is at most c after $O\left(\frac{\log(Kn)}{1-\lambda(P)}\right)$ rounds in the random matching model. This also
holds after $O\left(d \cdot \frac{\log(Kn)}{1-\lambda(M)}\right)$ rounds in the balancing circuit model if d is constant.

For precise definitions of both models, $\lambda(P)$, $\lambda(M)$ and d, we refer to Section 2.
Graph Family | Rounds | Discrepancy | Model | Ref.
Constant-Degree Expander Graphs | $O(\log(Kn))$ | $O(\log n)$ | det. (BC) | …
 | | $O(\log \log n)$ | rand. (BC & RM) | …
 | | $O(1)$ | rand. (BC & RM) | Thm. …
r-dim. Torus Graphs | $O(\log(Kn) \cdot …)$ | $O(n^{…})$ | det. (BC) | …
 | | $O(n^{1/(2r)} \sqrt{\log n})$ | rand. (BC) | …
 | | $O(n^{1/(2r)} \log n)$ | rand. (RM) | …
 | | $O(1)$ | rand. (BC & RM) | Thm. …
Regular Graphs | … | … | det. (BC) | …
 | | … | rand. (BC) | [13]
 | | $O(1)$ | rand. (BC, $d = O(1)$) | Thm. …
 | | … | rand. (RM) | …
 | | $O(1)$ | rand. (RM) | Thm. …
Arbitrary Graphs | … | … | det. (BC) | …
 | | … | rand. (BC) | …
 | $O(\tau_{\mathrm{cont}}(K, n^{-2}))$ | $O((\log n)^{\varepsilon})$ | rand. (BC & RM) | Thm. …
 | $O(\tau_{\mathrm{cont}}(K, n^{-2}) \cdot \log \log n)$ | $O(\log \log n)$ | rand. (BC & RM) | Thm. …

Table 1: Comparison of the results for the matching model with the previously best results. The initial discrepancy
is denoted by K, and $1 - \lambda$ denotes the spectral gap. Here, det. and rand. refer to the deterministic and randomized
orientation, respectively. BC and RM stand for the balancing circuit and random matching model, respectively. Note that
$\tau_{\mathrm{cont}}(K, n^{-2})$ is the time for the continuous process to reach a discrepancy of $n^{-2}$ w.p. $1 - n^{-1}$. For the precise definitions,
see Section 2.



The two bounds on the runtime in Theorem 1.1 match the ones from the continuous case
up to a constant factor (see Theorem 2.2 and Theorem 2.5). The previous best result for this
protocol holds only for expander graphs and the number of rounds is a factor $(\log \log n)^3$ larger
than ours [17]. For expander graphs and $K = \mathrm{poly}(n)$, our algorithm needs only $O(\log n)$ rounds,
which would be necessary even for any centralized algorithm. For general graphs, all previous
bounds on the discrepancy include the spectral gap $1 - \lambda$. Therefore, especially for graphs
which have small expansion like torus graphs, our main result represents a vast improvement
(Table 1).

We further analyze the matching model on non-regular graphs and our result (Theorem 3.7)
is almost tight, since the discrepancy is $O(\log \log n)$ and the runtime is only an $O(\log \log n)$
factor larger than in the continuous case. Together with Theorem 1.1, these results show that
for arbitrary networks, there is almost no difference between the discrete and continuous case.

Finally, we also study two natural diffusion-based protocols in the discrete case [?]. Our
discrepancy bounds there depend only polynomially on the maximum degree $\Delta$ and logarithmically
on n, while again all previous results include the spectral gap or are restricted to special
graph classes [?, ?, 32, 35].



Our Techniques. Our main results are based on the combination of two novel techniques
which may have further applications to other problems. First, instead of analyzing the rounding
errors for each edge directly [8, 17, 31, 32, 35], we adopt a token-based viewpoint and relate the
movements of tokens to independent random walks. This establishes a nice analogy between the
distribution of tokens and the well-studied balls-and-bins model (see Corollary 3.4). Secondly,
we employ potential functions to reduce the task of balancing an arbitrary load vector to the
task of balancing a sparse load vector, i.e., a load vector that contains much fewer tokens than
n. Especially for these sparse load vectors, the token-based viewpoint yields much stronger
concentration inequalities than the ones from previous work.

All of our discrepancy bounds make use of the so-called local 2-divergence, which has been
one of the most important tools to quantify the deviation between the continuous and the
discrete case [8, 17, 35]. We prove that for any graph and any sequence of matchings, the local
2-divergence is between 1 and $\sqrt{2}$, while all previous bounds on the local divergence include
graph parameters such as the spectral gap or the (maximum) degree. For the diffusion model,
the local 2-divergence is essentially given by $O(\sqrt{\Delta})$, where $\Delta$ is the maximum degree. Prior
to this, all bounds on the local 2-divergence in both communication models depended on the size
and expansion of the graph, or were restricted to certain graph classes.



Organization. The remainder of this paper is organized as follows. Section 2 introduces the
matching-based model and presents some basic results. In Section 3 we introduce our new
technique that relates the movement of the tokens to independent random walks. Based on this
technique, we derive results on the discrepancy that hold for arbitrary graphs (see Section 3.2).
The proof of our main result (Theorem 1.1) is given in Section 4. Finally, Section 5 contains
our results for the diffusion model.



Notations. We assume that $G = (V, E)$ is an undirected, connected graph with n nodes,
indexed from 1 to n. Several inequalities in this paper require that n is sufficiently large. For
any node u, let $N(u)$ be the set of neighbors of node u and $d(u) := |N(u)|$ the degree of node u.
The maximum degree of G is denoted by $\Delta := \max_u d(u)$. By diam(G) we denote the diameter
of G. Following [35], we use the notation $[u : v]$ for an edge $\{u, v\} \in E$ with $u < v$. For any
vector $x = (x_1, \ldots, x_n)$, the p-norm of x is defined by $\|x\|_p := \left(\sum_{i=1}^{n} |x_i|^p\right)^{1/p}$. In particular,
$\|x\|_\infty := \max_{1 \le i \le n} |x_i|$. For any n by n real symmetric matrix M, let $\lambda_1(M) \ge \cdots \ge \lambda_n(M)$ be
the n eigenvalues of matrix M. For simplicity, let $\lambda(M) := \max\{|\lambda_2(M)|, |\lambda_n(M)|\}$. By $\log(\cdot)$
we denote the natural logarithm.



2 The Matching Model 

In the matching model (also known as dimension exchange model, reflecting its seminal
application to hypercubes), every two matched nodes in round t balance their loads as evenly as
possible. This can be expressed by a symmetric n by n matching matrix $M^{(t)}$, where with slight
abuse of notation we use the same symbol for the matching and the corresponding matching
matrix. Matrix $M^{(t)}$ is defined by $M^{(t)}_{u,u} := 1/2$, $M^{(t)}_{v,v} := 1/2$ and $M^{(t)}_{u,v} = M^{(t)}_{v,u} := 1/2$ if
$\{u, v\} \in M^{(t)} \subseteq E$, and $M^{(t)}_{u,u} = 1$, $M^{(t)}_{u,v} = 0$ ($u \ne v$) if u is not matched. We will often consider
the product of consecutive matching matrices and denote this by $M^{[t_1,t_2]} := \prod_{s=t_1}^{t_2} M^{(s)}$ for two
rounds $t_1 \le t_2$. If $t_1 \ge t_2 + 1$, then $M^{[t_1,t_2]}$ is defined as the n by n identity matrix I.
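The definition of the matching matrix and of the product $M^{[t_1,t_2]}$ can be made concrete with a small sketch. The helper names (`matching_matrix`, `mat_mul`, `product`), the 0-based node indices and the use of exact fractions are illustrative assumptions and not part of the paper.

```python
from fractions import Fraction

def matching_matrix(n, matching):
    """n x n matching matrix for a list of disjoint edges (nodes 0..n-1)."""
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for u, v in matching:
        # Matched nodes average their loads: all four entries become 1/2.
        M[u][u] = M[v][v] = Fraction(1, 2)
        M[u][v] = M[v][u] = Fraction(1, 2)
    return M

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product(n, matchings, t1, t2):
    """M^{[t1,t2]} = M^{(t1)} ... M^{(t2)}; rounds are 1-indexed.
    For t1 >= t2 + 1 the loop is empty and the identity is returned."""
    P = matching_matrix(n, [])
    for s in range(t1, t2 + 1):
        P = mat_mul(P, matching_matrix(n, matchings[s - 1]))
    return P
```

Every matching matrix is symmetric and doubly stochastic, so each product is doubly stochastic as well, although it is in general no longer symmetric.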



2.1 Balancing Circuit and Random Matching Model 

In the balancing circuit model, a certain sequence of matchings is applied periodically. More
precisely, let $M^{(1)}, \ldots, M^{(d)}$ be a sequence of d matching matrices.² Then in round $t \ge 1$,
we apply the matching matrix $M^{(t)} := M^{(((t-1) \bmod d)+1)}$. Following [35], we define the round
matrix $M := \prod_{s=1}^{d} M^{(s)}$. Further, let $\lambda(M) := \max\{|\lambda_2(M)|, |\lambda_n(M)|\}$. We always assume
that $\lambda(M) < 1$, which is equivalent to the matrix M being ergodic. A natural choice for the d
matching matrices is given by an edge coloring of graph G. There are various efficient distributed
edge coloring algorithms (see for example, Panconesi and Srinivasan [33, 34]).

² Traditionally, the variable d has been used for the number of matchings [17, 35]. There may not exist a direct
relation between d and the (maximum) degree of the underlying graph G. However, the graph induced by the
union of the d matchings has maximum degree at most d.
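The periodic schedule $M^{(t)} := M^{(((t-1) \bmod d)+1)}$ and the edge-coloring choice can be sketched as follows. The even cycle and the function names are assumptions chosen only for illustration; any proper edge coloring of G would serve the same purpose.

```python
def matching_for_round(t, circuit):
    """Matching applied in round t >= 1 of the balancing circuit model.
    circuit[i] holds the matching with 1-based label i + 1, so the
    label ((t - 1) mod d) + 1 corresponds to list index (t - 1) % d."""
    d = len(circuit)
    return circuit[(t - 1) % d]

def cycle_edge_coloring(n):
    """Proper 2-edge-coloring of the even cycle 0-1-...-(n-1)-0,
    returned as d = 2 matchings (hypothetical example graph)."""
    assert n % 2 == 0
    even = [(i, i + 1) for i in range(0, n, 2)]
    odd = [(i, (i + 1) % n) for i in range(1, n, 2)]
    return [even, odd]
```

For the 6-cycle this yields the two matchings {01, 23, 45} and {12, 34, 50}, which are then applied alternately.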



The alternative to the balancing circuit model is the random matching model, where one
generates a random matching in each round. There are several simple and distributed randomized
protocols to generate such matchings. For instance, [31] analyzed a two-stage protocol for
d-regular graphs where in the first stage every edge is picked independently with probability
$\Theta(1/d)$. In the second stage, we consider the matching formed by all edges that are not incident
to any other edge chosen in the first stage. A similar protocol was studied in [?], which also works
for non-regular graphs. These protocols have two natural properties which are sufficient for our
analysis. First, we have $p_{\min} = \Omega(1/\Delta)$, where $p_{\min} := \min_{t \in \mathbb{N}} \min_{\{u,v\} \in E} \Pr[\{u,v\} \in M^{(t)}]$.
Secondly, random matchings generated in different rounds are mutually independent.
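A minimal sketch of the two-stage protocol described above, written centrally for readability. The constant `c` in the sampling probability $1/(c \cdot d)$ is an assumption (the text only specifies $\Theta(1/d)$), and the function name is hypothetical.

```python
import random

def two_stage_matching(edges, d, rng, c=4):
    """Return a random matching of a d-regular graph given as an edge list.
    The value c = 4 is an illustrative assumption, not taken from the paper."""
    # Stage 1: every edge is picked independently with probability 1/(c*d).
    picked = [e for e in edges if rng.random() < 1.0 / (c * d)]
    # Count, for each node, how many picked edges are incident to it.
    incident = {}
    for u, v in picked:
        incident[u] = incident.get(u, 0) + 1
        incident[v] = incident.get(v, 0) + 1
    # Stage 2: keep only picked edges that touch no other picked edge.
    return [(u, v) for u, v in picked
            if incident[u] == 1 and incident[v] == 1]
```

Calling the function with a fresh, independently seeded `random.Random` in each round gives matchings that are mutually independent across rounds, which is the second property used in the analysis.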

2.2 The Continuous Case 

In the continuous case the load is arbitrarily divisible. Let $\xi^{(0)} \in \mathbb{R}^n$ be the initial load vector and
in every round two matched nodes balance their loads perfectly. It is easy to see that this
process corresponds to a linear system and the load vector $\xi^{(t)}$, $t \ge 1$, can be expressed as
$\xi^{(t)} = \xi^{(t-1)} M^{(t)}$, which results in $\xi^{(t)} = \xi^{(0)} M^{[1,t]}$. Moreover,

\[ \xi_u^{(t)} = \xi_u^{(t-1)} + \sum_{v \colon \{u,v\} \in E} \left( \xi_v^{(t-1)} M_{v,u}^{(t)} - \xi_u^{(t-1)} M_{u,v}^{(t)} \right) = \xi_u^{(t-1)} + \sum_{v \colon \{u,v\} \in M^{(t)}} \left( \xi_v^{(t-1)} M_{v,u}^{(t)} - \xi_u^{(t-1)} M_{u,v}^{(t)} \right). \]

We define the average load by $\bar{\xi} := \sum_{w \in V} \xi_w^{(0)} / n$, which is invariant over the rounds t. Note that the
convergence in the continuous case depends on the randomly chosen matchings in the random
matching model, while it is "deterministic" in the balancing circuit model (for a fixed initial load
vector).
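As a small illustration of the continuous process, the following sketch applies one round $\xi^{(t)} = \xi^{(t-1)} M^{(t)}$ without building the matrix explicitly. The function names and the 4-cycle example are assumptions made for this illustration.

```python
from fractions import Fraction

def continuous_round(xi, matching):
    """One round of the continuous process: matched nodes average perfectly."""
    new = list(xi)
    for u, v in matching:
        new[u] = new[v] = (xi[u] + xi[v]) / 2
    return new

def disc(xi):
    """Discrepancy: difference between maximum and minimum load."""
    return max(xi) - min(xi)

# Hypothetical example: the 4-cycle with its two perfect matchings.
xi0 = [Fraction(8), Fraction(0), Fraction(0), Fraction(0)]
xi1 = continuous_round(xi0, [(0, 1), (2, 3)])
xi2 = continuous_round(xi1, [(1, 2), (3, 0)])
```

In this example the total load (and hence the average load) is the same in all three vectors, while the discrepancy never increases from one round to the next.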

Definition 2.1. Let G be any graph. Fix any pair $(K, \varepsilon)$ with $K \ge \varepsilon > 0$. For any pair
of integers $t_1 < t_2$, we call a time-interval $[t_1, t_2]$ associated with a sequence of matchings
$(M^{(t_1+1)}, \ldots, M^{(t_2)})$ $(K, \varepsilon)$-smoothing if for any $\xi^{(t_1)} \in \mathbb{R}^n$, $\operatorname{disc}(\xi^{(t_1)}) \le K \Rightarrow \operatorname{disc}(\xi^{(t_2)}) \le \varepsilon$,
where the discrepancy of any load vector $\xi$ is defined by $\operatorname{disc}(\xi) := \max_{u,v \in V} |\xi_u - \xi_v|$.

• For the balancing circuit model, define

\[ \tau_{\mathrm{cont}}(K, \varepsilon) := \min\{ t \in \mathbb{N} \colon [0, t] \text{ is } (K, \varepsilon)\text{-smoothing} \}, \]

i.e., $\tau_{\mathrm{cont}}(K, \varepsilon)$ is the minimum number of rounds in the continuous case to reach discrepancy
$\varepsilon$ for any initial vector with discrepancy at most K.

• For the random matching model, define

\[ \tau_{\mathrm{cont}}(K, \varepsilon) := \min\{ t \in \mathbb{N} \colon \Pr[\, [0, t] \text{ is } (K, \varepsilon)\text{-smoothing} \,] \ge 1 - n^{-1} \}, \]

i.e., $\tau_{\mathrm{cont}}(K, \varepsilon)$ is the minimum number of rounds in the continuous case so that with
probability at least $1 - n^{-1}$, we reach a discrepancy of $\varepsilon$ for any initial vector $\xi^{(0)}$ with
discrepancy at most K. Note that the probability space is taken over the t randomly chosen
matchings $M^{(1)}, \ldots, M^{(t)}$.

Note that in both the balancing circuit and the random matching model, $\tau_{\mathrm{cont}}(K, \varepsilon)$ is not
a random variable.
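To get a feeling for the definition in the balancing circuit model, one can measure how many rounds a single initial vector needs until its discrepancy drops to $\varepsilon$. This is only a sketch: since $\tau_{\mathrm{cont}}(K, \varepsilon)$ quantifies over all initial vectors with discrepancy at most K, the value computed here for one vector is merely a lower bound on it. All names are illustrative assumptions.

```python
from fractions import Fraction

def disc(xi):
    return max(xi) - min(xi)

def average_matched(xi, matching):
    out = list(xi)
    for u, v in matching:
        out[u] = out[v] = (xi[u] + xi[v]) / 2
    return out

def rounds_until_smooth(xi0, circuit, eps, max_rounds=10_000):
    """Smallest t with disc(xi^(t)) <= eps for this particular start vector,
    or None if max_rounds is exceeded. The discrepancy is non-increasing,
    so the first hitting time is well defined."""
    xi = [Fraction(a) for a in xi0]
    for t in range(max_rounds + 1):
        if disc(xi) <= eps:
            return t
        # Round t + 1 uses the matching with list index t mod d.
        xi = average_matched(xi, circuit[t % len(circuit)])
    return None
```

For the 4-cycle with matchings {01, 23} and {12, 30}, the start vector (8, 0, 0, 0) reaches discrepancy at most 1 after two rounds.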

Following previous works [?, 35], we adopt the view that the continuous case ($\tau_{\mathrm{cont}}(K, \varepsilon)$)
is well-understood and our goal is to analyze the discrete case. For the balancing circuit model,
there is indeed a natural bound on $\tau_{\mathrm{cont}}(K, \varepsilon)$ depending on the spectral gap of M.

Theorem 2.2 ([35, Theorem 1]). Let G be any graph. Consider the balancing circuit model
with d matchings , . . . , M( d ' . Then for any e > 0, T cont (K, e) ^ d ■ j^(M} " lo § ('if ) ■ 

For the random matching model, our result will depend on p m i n and the spectral gap of the 
diffusion matrix P, defined as P u ,v '■= j% if {u, v} G E, P U)V := 1 — if v = u, and P U)V := 
otherwise. 



Theorem 2.3 ( 3jJ, Theorem 1]). Let G be any d-regular graph. Consider the random matching 
model in the continuous case and let p m i n be the minimum probability for an edge to be included 

in the matching. Define the quadratic potential as $W := Y2ueV ~ ^) • Then f or an U 
round t, we have 

d-Pn 



E 



$(*) 



^ 1 



•(1-A 2 (P)) -$(°) 



Theorem 12.31 implies the following corollary. 
Corollary 2.4. For any node v G V and round t, it holds for a constant c > that 



Pr 



M 



u] 



< - + e^ 1 ^' 
2 n 



-c-(l-X 2 )-t 



Proof. Let £/°) be the initial load vector with Q := 1 and '■= for u ^ v. Then, 

£(*) = V £(°) • M' 1 '*] = 1 • Ml 1 ''! 



and £ = 1/n. By Theorem 

E 



< ( 1-f (1- A 2 (P))-p mi „) -# ,r ". 



Since p min = let p min ^ e'/d for a constant c' G R. Using $(°) = (l — ±) 2 + (n — 1) 

(£) = 1 - ^ 1, we obtain 



E 



$(*) 



^ ( 1 _ ^ . (1 _ A 2 (P)) ) . 1 < e-^/^-a-Aa)-* 



Hence by Markov's inequality, 



Pr 



$W > e -c'/8.(l-A 2 )-i 



< e 



-c'/8-(l-A 2 )-t 



Assuming that ^ e c '/ 8 '( 1 A2 )'* implies for any node u £ V, 



Ml 1 '*! 



£ l 



(0 



■;? 



< e -c'/8-(l-A 2 )-t 



and rearranging yields 



Ml^I + e-«-^. 
n 

By setting c := c'/16 we finish the proof. ■ 
Theorem 12.31 also implies the following upper bound on T cont (K, e). 

Theorem 2.5. Let G be any d-regular graph and consider the random matching model. Then 
for any e > 0, it holds that 



T cont (K,e) ^ 



d-Pmin 1-A 2 (P) 



log 



Kn 1 
7/2 
Hence for $p_{\min} = \Theta(1/d)$, we obtain essentially the same convergence as for the first-order-
diffusion scheme (cmp. Theorem 5.1), although the communication is restricted to a single
matching in each round. A more complicated but also more general result for non-regular
graphs can be found in [11, Theorem 5]. However, since our proof for non-regular graphs does
not require a concrete runtime bound, we prefer to state our results for non-regular graphs in
terms of $\tau_{\mathrm{cont}}(K, \varepsilon)$.

Proof of Theorem \2.5[ Fix any vertex v G V and let be the unit vector which has 1 at 
position v and otherwise. Define the quadratic potential function as $(*) = ^2 u£V (£u — £) 2 - 



Hence, $(°) = 1 - ± < 1. Choosing t := ■ • log (fr^) in Theorem O yields 



E 



$(*) 



< 1 



■(1-A 2 (P))) -$(°) ^e- log 



(K 2 n 4 )+log(e 2 /4) 



Hence by Markov's inequality, 



Pr 



AK 2 n 2 



n 



2 ' 



If $W ^ kK^rfi ' then for every node u G V, we have 



and = , it follows that 



< Vife = 2Kn- SillCe £ = n 



Ml 1 '*! - - 

<^ n 



2Kn 



Therefore, 



Pr 



Vu G V: 



Ml 1 '*! 



n 



2ifn 



^ 1 - n 



-2 



Replacing the initial load vector by the other unit vectors, repeating the above argument 
and taking the union bound gives 



Pr 



Vu G V,Vu G V: 



Ml 1 -*! 



1 



2Kn 



> 1 — n • n 2 = 1 — n 1 . 



M [1,t] - 1 



For the remainder of the proof, assume that for all nodes u, v G V, 
Then if we start with any initial load vector with discrepancy at most K, then 

e(t) = ^ f (0) . M (t) 

1)61/ 

= E?- M ^ + E^ 0) -o- M £ 

= I + E^ 0) - o ■ - + £(d 0) - o ■ (m& - 1) 



< h ° lds - 



^t + Y k ■ — 



Similarly, £^ ^ £ — e/2. Hence the discrepancy of £ at the end of round t is at most e. 
2.3 The Discrete Case

Let us now turn to the discrete case with indivisible, unit-size tokens. Let $x^{(0)} \in \mathbb{Z}^n$ be the
initial load vector with average load $\bar{x} := \sum_{w \in V} x_w^{(0)} / n$ and $x^{(t)}$ be the load vector at the end
of round t. If the sum of tokens of two matched nodes is odd, we have to decide which of
the matched nodes should get the excess token. To this end, we employ the so-called random
orientation ([?]) in the spirit of randomized rounding. More precisely, for any two matched
nodes u and v in round t, node u gets $\left\lceil \frac{x_u^{(t-1)} + x_v^{(t-1)}}{2} \right\rceil$ or $\left\lfloor \frac{x_u^{(t-1)} + x_v^{(t-1)}}{2} \right\rfloor$ tokens, with probability
1/2 each. The remaining tokens are assigned to node v. We can also think of this as first
assigning $\left\lfloor \frac{x_u^{(t-1)} + x_v^{(t-1)}}{2} \right\rfloor$ tokens to both u and v and then assigning the excess token (if there
is one) to u or v with probability 1/2 each. We use a uniform random variable $\Phi_{u,v}^{(t)} \in \{-1, 1\}$
to specify the orientation for an edge $\{u, v\}$ in $M^{(t)}$, i.e., indicating where the excess token (if
any) is assigned to. If $\Phi_{u,v}^{(t)} = 1$, then the excess token is assigned to u and if $\Phi_{u,v}^{(t)} = -1$, then
the excess token is assigned to v. Note that $\Phi_{u,v}^{(t)} = -\Phi_{v,u}^{(t)}$. Further, we point out that the
deterministic orientation of Rabani et al. [35] corresponds to $\Phi_{u,v}^{(t)} = 1$ for $x_u^{(t-1)} \ge x_v^{(t-1)}$ and
$\Phi_{u,v}^{(t)} = -1$ otherwise.
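One round of the discrete protocol with random orientation can be sketched as follows. The function name and the use of Python's `random.Random` as the source of the orientation $\Phi_{u,v}$ are assumptions made for illustration.

```python
import random

def discrete_round(x, matching, rng):
    """One round with random orientation on integer loads x."""
    new = list(x)
    for u, v in matching:
        total = x[u] + x[v]
        odd = total % 2                # Odd(x_u + x_v), either 0 or 1
        phi = rng.choice((-1, 1))      # orientation Phi_{u,v}
        # total + odd*phi is always even, so the divisions are exact.
        new[u] = (total + odd * phi) // 2   # ceiling if phi = 1, else floor
        new[v] = (total - odd * phi) // 2   # the remaining tokens
    return new
```

After the round, the two endpoints of every matched edge differ by at most one token, and the total number of tokens is unchanged.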

For every edge $\{u, v\}$ which is part of the matching $M^{(t)}$, define the corresponding error
term by

\[ e_{u,v}^{(t)} := \tfrac{1}{2} \operatorname{Odd}\left(x_u^{(t-1)} + x_v^{(t-1)}\right) \cdot \Phi_{u,v}^{(t)}, \]

where $\operatorname{Odd}(x) := x \bmod 2$. Moreover, for any round t we define an error vector $e^{(t)}$ with
$e_u^{(t)} := \sum_{v \colon \{u,v\} \in M^{(t)}} e_{u,v}^{(t)}$. With this notation, the load vector in round t is $x^{(t)} = x^{(t-1)} M^{(t)} + e^{(t)}$.
Solving this recursion (cf. [?]) yields

\[ x^{(t)} = x^{(0)} M^{[1,t]} + \sum_{s=1}^{t} e^{(s)} M^{[s+1,t]} = \xi^{(t)} + \sum_{s=1}^{t} e^{(s)} M^{[s+1,t]}, \]

where $\xi^{(t)}$ is the corresponding load vector in the continuous case initialized with $\xi^{(0)} = x^{(0)}$.
Hence, for any node $w \in V$ we have

\[ x_w^{(t)} - \xi_w^{(t)} = \sum_{s=1}^{t} \sum_{u \in V} \sum_{v \colon \{u,v\} \in M^{(s)}} e_{u,v}^{(s)} M_{u,w}^{[s+1,t]} = \sum_{s=1}^{t} \sum_{[u:v] \in M^{(s)}} e_{u,v}^{(s)} \left( M_{u,w}^{[s+1,t]} - M_{v,w}^{[s+1,t]} \right), \tag{2.1} \]

where the last equality used $e_{u,v}^{(s)} = -e_{v,u}^{(s)}$ (recall that $[u : v] \in M^{(s)}$ means $\{u, v\} \in M^{(s)}$ and
$u < v$). Occasionally it will be convenient to "normalize" the load vector so that $\bar{x} \in [0, 1)$
(cmp. Observation 2.6). Although this may lead to negative entries in the load vector, the
above formulas still hold.
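The decomposition of the deviation into propagated rounding errors can be checked numerically. The sketch below runs the discrete and the continuous process side by side, records the error vectors $e^{(s)}$, and recomputes $\sum_{s} e^{(s)} M^{[s+1,t]}$ by pushing each error vector through the later matchings. All function names are hypothetical, and exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction
import random

def apply_matching(vec, matching):
    """Row vector times matching matrix: matched entries are averaged."""
    out = list(vec)
    for u, v in matching:
        out[u] = out[v] = (vec[u] + vec[v]) / 2
    return out

def simulate(x0, matchings, rng):
    """Return discrete loads x, continuous loads xi and all error vectors."""
    x = [Fraction(a) for a in x0]
    xi = list(x)
    errors = []
    for matching in matchings:
        e = [Fraction(0)] * len(x)
        for u, v in matching:
            odd = (x[u] + x[v]) % 2
            phi = rng.choice((-1, 1))
            e[u] = odd * phi / 2       # e_{u,v} = Odd(x_u + x_v) * Phi / 2
            e[v] = -e[u]               # e_{v,u} = -e_{u,v}
        x = [a + b for a, b in zip(apply_matching(x, matching), e)]
        xi = apply_matching(xi, matching)
        errors.append(e)
    return x, xi, errors

def propagated_errors(errors, matchings):
    """sum_{s=1}^{t} e^{(s)} M^{[s+1,t]} with t = len(matchings)."""
    total = [Fraction(0)] * len(errors[0])
    for s, e in enumerate(errors):
        vec = e
        for matching in matchings[s + 1:]:
            vec = apply_matching(vec, matching)
        total = [a + b for a, b in zip(total, vec)]
    return total
```

Comparing `x - xi` entry by entry with `propagated_errors(errors, matchings)` gives equality for every orientation sequence, which is exactly the statement preceding (2.1).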

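Observation 2.6. Fix a sequence of matchings $\mathcal{M} = (M^{(1)}, M^{(2)}, \ldots)$ and orientations $\Phi_{u,v}^{(t)}$,
$[u : v] \in M^{(t)}$, $t \in \mathbb{N}$. Consider two executions of the discrete load balancing protocol with the
same matchings and orientations, but with different initial load vectors, $x^{(0)}$ and $\tilde{x}^{(0)}$. Then the
following statements hold:

1. If $\tilde{x}^{(0)} = x^{(0)} + \alpha \cdot \mathbf{1}$ for some $\alpha \in \mathbb{Z}$, then $\tilde{x}^{(t)} = x^{(t)} + \alpha \cdot \mathbf{1}$ for all $t \in \mathbb{N}$.

2. If $x_u^{(0)} \le \tilde{x}_u^{(0)}$ for all $u \in V$, then $x_u^{(t)} \le \tilde{x}_u^{(t)}$ for all $u \in V$ and $t \in \mathbb{N}$.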
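Both statements of the observation can be tried out on a small instance by fixing the orientations in advance and replaying them for different initial vectors. The sketch below is only an empirical check on one hypothetical instance (a 6-cycle with alternating matchings), not a proof; all names are assumptions.

```python
import random

def run_discrete(x0, matchings, orientations):
    """Discrete protocol with prescribed orientations.
    orientations[s][(u, v)] is +1 (excess token to u) or -1 (to v)."""
    x = list(x0)
    for matching, phi in zip(matchings, orientations):
        new = list(x)
        for u, v in matching:
            total = x[u] + x[v]
            odd = total % 2
            new[u] = (total + odd * phi[(u, v)]) // 2
            new[v] = (total - odd * phi[(u, v)]) // 2
        x = new
    return x

rng = random.Random(1)
circuit = [[(0, 1), (2, 3), (4, 5)], [(1, 2), (3, 4), (5, 0)]]
matchings = [circuit[t % 2] for t in range(10)]
orientations = [{e: rng.choice((-1, 1)) for e in m} for m in matchings]
x0 = [rng.randrange(0, 20) for _ in range(6)]
base = run_discrete(x0, matchings, orientations)
shifted = run_discrete([v + 3 for v in x0], matchings, orientations)
larger = run_discrete([v + rng.randrange(0, 5) for v in x0],
                      matchings, orientations)
```

Here `shifted` equals `base` with 3 added to every entry (statement 1), and `larger` dominates `base` entrywise (statement 2), because floor and ceiling of half the pair sum are monotone in that sum.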

The next lemma says that upper bounding the maximum load is essentially equivalent to 
lower bounding the minimum load. 
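Lemma 2.7. Fix a sequence of matchings $\mathcal{M} = (M^{(1)}, M^{(2)}, \ldots)$. For any triple of non-negative
integers K, $\alpha$ with $1 \le \alpha \le K$ and t,

\[ \max_{\substack{y \in \mathbb{Z}^n \colon \\ \operatorname{disc}(y) \le K}} \Pr\left[ x_{\max}^{(t)} \ge \lfloor \bar{x} \rfloor + \alpha \,\middle|\, x^{(0)} = y \right] \le \max_{\substack{y \in \mathbb{Z}^n \colon \\ \operatorname{disc}(y) \le K}} \Pr\left[ x_{\min}^{(t)} \le \lfloor \bar{x} \rfloor - \alpha + 3 \,\middle|\, x^{(0)} = y \right], \]

and similarly,

\[ \max_{\substack{y \in \mathbb{Z}^n \colon \\ \operatorname{disc}(y) \le K}} \Pr\left[ x_{\min}^{(t)} \le \lfloor \bar{x} \rfloor - \alpha \,\middle|\, x^{(0)} = y \right] \le \max_{\substack{y \in \mathbb{Z}^n \colon \\ \operatorname{disc}(y) \le K}} \Pr\left[ x_{\max}^{(t)} \ge \lfloor \bar{x} \rfloor + \alpha - 3 \,\middle|\, x^{(0)} = y \right]. \]

Proof. We define a coupling between two executions of the load balancing algorithm. For every
round t, the two executions use the same matching. In the first execution, we start with
$x^{(0)} = z \in \mathbb{Z}^n$ that maximizes $\Pr\left[ x_{\max}^{(t)} \ge \lfloor \bar{x} \rfloor + \alpha \mid x^{(0)} = z \right]$ and satisfies $\operatorname{disc}(z) \le K$. The
load vector of the second execution is denoted by $\tilde{x}^{(t)}$ and is initialized by

\[ \tilde{x}_u^{(0)} := \lfloor \bar{z} \rfloor - (z_u - \lfloor \bar{z} \rfloor) = 2 \cdot \lfloor \bar{z} \rfloor - z_u \]

for any $u \in V$. Note that $\operatorname{disc}(\tilde{x}^{(0)}) = \operatorname{disc}(z) \le K$ and $\bar{\tilde{x}} \ge \bar{x} - 2$. We couple the random
choices of the two executions by setting for every $[u : v] \in M^{(s)}$, $1 \le s \le t$, $\tilde{\Phi}_{u,v}^{(s)} = -\Phi_{u,v}^{(s)}$, where
$\tilde{\Phi}$ denotes the random choices of the second execution. We now claim that for every round
$s \ge 1$,

\[ \tilde{x}^{(s)} = 2 \cdot \lfloor \bar{x} \rfloor \cdot \mathbf{1} - x^{(s)}. \tag{2.2} \]

This claim is shown by induction on s as follows. First consider a node u which is matched with
a node v in round s. Hence,

\[ \tilde{x}_u^{(s)} = \tfrac{1}{2} \tilde{x}_u^{(s-1)} + \tfrac{1}{2} \tilde{x}_v^{(s-1)} + \tfrac{1}{2} \operatorname{Odd}\left( \tilde{x}_u^{(s-1)} + \tilde{x}_v^{(s-1)} \right) \cdot \tilde{\Phi}_{u,v}^{(s)}, \]

and using the induction hypothesis yields

\begin{align*}
\tilde{x}_u^{(s)} &= \tfrac{1}{2} \left( 2\lfloor \bar{x} \rfloor - x_u^{(s-1)} \right) + \tfrac{1}{2} \left( 2\lfloor \bar{x} \rfloor - x_v^{(s-1)} \right) + \tfrac{1}{2} \operatorname{Odd}\left( \tilde{x}_u^{(s-1)} + \tilde{x}_v^{(s-1)} \right) \cdot \tilde{\Phi}_{u,v}^{(s)} \\
&= 2\lfloor \bar{x} \rfloor - \left( \tfrac{1}{2} x_u^{(s-1)} + \tfrac{1}{2} x_v^{(s-1)} + \tfrac{1}{2} \operatorname{Odd}\left( 2\lfloor \bar{x} \rfloor - x_u^{(s-1)} + 2\lfloor \bar{x} \rfloor - x_v^{(s-1)} \right) \cdot \Phi_{u,v}^{(s)} \right) \\
&= 2\lfloor \bar{x} \rfloor - x_u^{(s)},
\end{align*}

where the last equality holds since $\operatorname{Odd}\left( 2\lfloor \bar{x} \rfloor - x_u^{(s-1)} + 2\lfloor \bar{x} \rfloor - x_v^{(s-1)} \right) = \operatorname{Odd}\left( x_u^{(s-1)} + x_v^{(s-1)} \right)$.
If a node u is not matched in round s, then the claim follows directly by the induction
hypothesis. Hence, (2.2) holds, which implies that for any $1 \le \alpha \le K$,

\begin{align*}
\max_{\substack{y \in \mathbb{Z}^n \colon \\ \operatorname{disc}(y) \le K}} \Pr\left[ x_{\max}^{(t)} \ge \lfloor \bar{x} \rfloor + \alpha \,\middle|\, x^{(0)} = y \right]
&= \Pr\left[ x_{\max}^{(t)} \ge \lfloor \bar{x} \rfloor + \alpha \,\middle|\, x^{(0)} = z \right] \\
&= \Pr\left[ \tilde{x}_{\min}^{(t)} \le \lfloor \bar{x} \rfloor - \alpha \,\middle|\, \tilde{x}^{(0)} = 2 \cdot \lfloor \bar{z} \rfloor \cdot \mathbf{1} - z \right] \\
&\le \Pr\left[ \tilde{x}_{\min}^{(t)} \le \lfloor \bar{\tilde{x}} \rfloor - \alpha + 3 \,\middle|\, \tilde{x}^{(0)} = 2 \cdot \lfloor \bar{z} \rfloor \cdot \mathbf{1} - z \right] \\
&\le \max_{\substack{y \in \mathbb{Z}^n \colon \\ \operatorname{disc}(y) \le K}} \Pr\left[ x_{\min}^{(t)} \le \lfloor \bar{x} \rfloor - \alpha + 3 \,\middle|\, x^{(0)} = y \right].
\end{align*}

The second inequality is shown in exactly the same way. This completes the proof. ■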



Remark 2.8. When referring to the random matching model (discrete case), the probability 
space is over the randomly generated matchings and the randomized orientation of the matchings. 
For the balancing circuit model (discrete case), the probability space is over the randomized 
orientation of the (deterministic) matchings. 
2.4 Local Divergence and Discrepancy

To bound the deviation between the discrete and continuous case, we consider $\max_{w \in V} \left| x_w^{(t)} - \xi_w^{(t)} \right|$
at all rounds t and define the local divergence for the matching model.

Definition 2.9 (Local p-Divergence for Matchings). For any graph G, $p \in \mathbb{Z}^+$ and an arbitrary
sequence of matchings $\mathcal{M} = (M^{(1)}, M^{(2)}, \ldots)$, the local p-divergence is

\[ \Psi_p(\mathcal{M}) := \max_{w \in V} \left( \sup_{t \in \mathbb{N}} \sum_{s=1}^{t} \sum_{[u:v] \in M^{(s)}} \left| M_{u,w}^{[s+1,t]} - M_{v,w}^{[s+1,t]} \right|^p \right)^{1/p}. \]

Comparing the above definition with (2.1), one can see that $\Psi_1(\mathcal{M})$ is a natural quantity
that measures the sum of load differences across all edges in the network, aggregated over time
[35], and $\Psi_p(\mathcal{M})$ is the p-th norm version of $\Psi_1(\mathcal{M})$. Next we turn to our upper bound on the local
2-divergence.

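Theorem 2.10. For any graph G and any sequence of matchings $\mathcal{M}$, $\Psi_2(\mathcal{M}) \le \sqrt{2 - 2 \cdot n^{-1}}$.
Moreover, if there is a matching $M^{(t)}$ in $\mathcal{M}$ such that the set $M^{(t)} \ne \emptyset$, then $\Psi_2(\mathcal{M}) \ge 1$,
otherwise $\Psi_2(\mathcal{M}) = 0$.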
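The bounds of the theorem can be observed numerically for a fixed horizon t. The sketch below evaluates the inner double sum of the local 2-divergence for one node w and one value of t by walking backwards through the matchings and maintaining column w of $M^{[s+1,t]}$. The function name and the 8-cycle example are assumptions, and truncating at a fixed t only gives the value inside the supremum, not the supremum itself.

```python
from fractions import Fraction

def truncated_divergence_sq(n, matchings, w):
    """sum_{s=1}^{t} sum_{[u:v] in M^(s)} (M^{[s+1,t]}_{u,w} - M^{[s+1,t]}_{v,w})^2
    for t = len(matchings), computed exactly."""
    # y[u] = M^{[s+1,t]}_{u,w}; for s = t this is column w of the identity.
    y = [Fraction(int(u == w)) for u in range(n)]
    total = Fraction(0)
    for matching in reversed(matchings):       # s = t, t-1, ..., 1
        total += sum((y[u] - y[v]) ** 2 for u, v in matching)
        new = list(y)
        for u, v in matching:
            # M^{[s,t]} = M^{(s)} M^{[s+1,t]}: rows u and v are averaged.
            new[u] = new[v] = (y[u] + y[v]) / 2
        y = new
    return total

# Hypothetical example: 8-cycle, two alternating perfect matchings, t = 20.
n = 8
circuit = [[(i, i + 1) for i in range(0, n, 2)],
           [(i, (i + 1) % n) for i in range(1, n, 2)]]
matchings = [circuit[s % 2] for s in range(20)]
values = [truncated_divergence_sq(n, matchings, w) for w in range(n)]
```

In this example every entry of `values` lies between 1 and $2 - 2/n$, so its square root lies between 1 and $\sqrt{2 - 2/n}$, in line with the theorem.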

While all previous upper bounds on the local divergence are functions of the expansion,
the degree and/or the number of nodes [?, 17, ?], Theorem 2.10 establishes that the local
2-divergence is essentially independent of any graph parameter. Additionally, the local
1-divergence is always lower bounded by the diameter of graph G (cf. [13]).

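Proof of Theorem 2.10. Fix any node $w \in V$ and round $t$. For any $0 \le s \le t$, define the following potential function:

\[
\Phi^{(s)} := \sum_{k \in V} \left( \mathbf{M}^{[s+1,t]}_{k,w} - \frac{1}{n} \right)^{2}.
\]

Observe that since $\mathbf{M}^{[t+1,t]}$ is the identity matrix, $\Phi^{(t)} = 1 \cdot \left(1 - \frac{1}{n}\right)^2 + (n-1) \cdot \left(\frac{1}{n}\right)^2 = 1 - \frac{1}{n}$. Consider now any round $1 \le s \le t$, and let $u, v$ be nodes with $[u:v] \in \mathbf{M}^{(s)}$. Let $y_u := \mathbf{M}^{[s+1,t]}_{u,w}$ and $y_v := \mathbf{M}^{[s+1,t]}_{v,w}$. Note that

\[
\mathbf{M}^{[s,t]}_{u,w} = \sum_{k \in V} \mathbf{M}^{(s)}_{u,k} \, \mathbf{M}^{[s+1,t]}_{k,w} = \frac{y_u + y_v}{2},
\]

and similarly, $\mathbf{M}^{[s,t]}_{v,w} = \frac{y_u + y_v}{2}$. Therefore, the contribution of $u$ and $v$ to $\Phi^{(s)} - \Phi^{(s-1)}$ is equal to

\[
\begin{aligned}
& \left( \mathbf{M}^{[s+1,t]}_{u,w} - \tfrac{1}{n} \right)^2 + \left( \mathbf{M}^{[s+1,t]}_{v,w} - \tfrac{1}{n} \right)^2 - \left( \mathbf{M}^{[s,t]}_{u,w} - \tfrac{1}{n} \right)^2 - \left( \mathbf{M}^{[s,t]}_{v,w} - \tfrac{1}{n} \right)^2 \\
&= \left( y_u - \tfrac{1}{n} \right)^2 + \left( y_v - \tfrac{1}{n} \right)^2 - 2 \left( \tfrac{y_u + y_v}{2} - \tfrac{1}{n} \right)^2 \\
&= y_u^2 - \tfrac{2 y_u}{n} + \tfrac{1}{n^2} + y_v^2 - \tfrac{2 y_v}{n} + \tfrac{1}{n^2} - 2 \left( \tfrac{(y_u + y_v)^2}{4} - \tfrac{y_u + y_v}{n} + \tfrac{1}{n^2} \right) \\
&= y_u^2 + y_v^2 - \tfrac{y_u^2 + 2 y_u y_v + y_v^2}{2} \\
&= \tfrac{y_u^2}{2} - y_u y_v + \tfrac{y_v^2}{2} = \tfrac{1}{2} \left( y_u - y_v \right)^2 .
\end{aligned}
\]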
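If a node is not matched in round $s$, then its contribution to $\Phi^{(s)} - \Phi^{(s-1)}$ equals zero. Accumulating the contribution of all nodes yields

\[
\Phi^{(s)} - \Phi^{(s-1)} = \sum_{[u:v] \in \mathbf{M}^{(s)}} \frac{1}{2} \left( \mathbf{M}^{[s+1,t]}_{u,w} - \mathbf{M}^{[s+1,t]}_{v,w} \right)^2 .
\]

For the upper bound on $\Psi_2(\mathcal{M})$, we take the sum over $t$ rounds to obtain that

\[
\sum_{s=1}^{t} \sum_{[u:v] \in \mathbf{M}^{(s)}} \frac{1}{2} \left( \mathbf{M}^{[s+1,t]}_{u,w} - \mathbf{M}^{[s+1,t]}_{v,w} \right)^2 = \sum_{s=1}^{t} \left( \Phi^{(s)} - \Phi^{(s-1)} \right) = \Phi^{(t)} - \Phi^{(0)} \le 1 - \frac{1}{n},
\]

and thus

\[
\sum_{s=1}^{t} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \mathbf{M}^{[s+1,t]}_{u,w} - \mathbf{M}^{[s+1,t]}_{v,w} \right)^2 \le 2 \left( 1 - \frac{1}{n} \right),
\]

which directly implies that $\Psi_2(\mathcal{M}) \le \sqrt{2 - 2 \cdot n^{-1}}$. For the lower bound, consider any round $t$ with $[u:v] \in \mathbf{M}^{(t)}$. Clearly,

\[
\mathbf{M}^{[t+1,t]}_{u,u} = \mathbf{M}^{[t+1,t]}_{v,v} = 1 \quad \text{and} \quad \mathbf{M}^{[t+1,t]}_{u,v} = \mathbf{M}^{[t+1,t]}_{v,u} = 0,
\]

and hence

\[
\Psi_2(\mathcal{M}) \ge \left( \sum_{[u:v] \in \mathbf{M}^{(t)}} \left( \mathbf{M}^{[t+1,t]}_{u,u} - \mathbf{M}^{[t+1,t]}_{v,u} \right)^2 \right)^{1/2} \ge 1 .
\]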
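A small worked instance may help to see the telescoping argument at work. The following LaTeX sketch takes $n = 2$, a single round $t = 1$ and the only possible matching $[1:2]$ (these choices are ours, made for illustration) and checks both the potential identity and the two bounds on $\Psi_2$.

```latex
% Illustrative instance: n = 2, t = 1, M^{(1)} = \{[1:2]\}, fixed node w = 1.
\begin{align*}
\Phi^{(1)} &= \bigl(\mathbf{M}^{[2,1]}_{1,1} - \tfrac12\bigr)^2
            + \bigl(\mathbf{M}^{[2,1]}_{2,1} - \tfrac12\bigr)^2
            = \bigl(1 - \tfrac12\bigr)^2 + \bigl(0 - \tfrac12\bigr)^2 = \tfrac12, \\
\Phi^{(0)} &= \bigl(\mathbf{M}^{[1,1]}_{1,1} - \tfrac12\bigr)^2
            + \bigl(\mathbf{M}^{[1,1]}_{2,1} - \tfrac12\bigr)^2
            = 0 + 0 = 0, \\
\Phi^{(1)} - \Phi^{(0)} &= \tfrac12
            = \tfrac12 \bigl(\mathbf{M}^{[2,1]}_{1,1} - \mathbf{M}^{[2,1]}_{2,1}\bigr)^2
            = \tfrac12 (1 - 0)^2 .
\end{align*}
% Hence the inner sum of the local 2-divergence equals 1 for this horizon,
% which matches both the lower bound 1 and the upper bound
% \sqrt{2 - 2/n} = \sqrt{2 - 1} = 1 for n = 2.
```

For $n = 2$ the upper and the lower bound therefore coincide.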



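Using the same arguments as in the proof of Theorem 2.10, we obtain the following corollary.

Corollary 2.11. Let $G = (V, E)$ be any graph with an arbitrary sequence of matchings $\mathcal{M} = (\mathbf{M}^{(1)}, \mathbf{M}^{(2)}, \ldots)$. For an arbitrary node $w \in V$ and any pair of rounds $t_1 < t_2$, it holds that

\[
\sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \mathbf{M}^{[s+1,t_2]}_{u,w} - \mathbf{M}^{[s+1,t_2]}_{v,w} \right)^2 \le 2 \cdot \sum_{u \in V} \left( \mathbf{M}^{[t_1+1,t_2]}_{u,w} - \frac{1}{n} \right)^2 .
\]

Proof. The proof uses similar arguments as the proof of Theorem 2.10. Fix any node $w \in V$. For any $0 \le s \le t_2$, define $\Phi^{(s)} := \sum_{u \in V} \left( \mathbf{M}^{[s+1,t_2]}_{u,w} - \frac{1}{n} \right)^2$. As shown in the proof of Theorem 2.10,

\[
\Phi^{(s)} - \Phi^{(s-1)} = \sum_{[u:v] \in \mathbf{M}^{(s)}} \frac{1}{2} \left( \mathbf{M}^{[s+1,t_2]}_{u,w} - \mathbf{M}^{[s+1,t_2]}_{v,w} \right)^2 .
\]

Therefore,

\[
\sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \mathbf{M}^{[s+1,t_2]}_{u,w} - \mathbf{M}^{[s+1,t_2]}_{v,w} \right)^2 = 2 \sum_{s=1}^{t_1} \left( \Phi^{(s)} - \Phi^{(s-1)} \right) = 2 \Phi^{(t_1)} - 2 \Phi^{(0)} \le 2 \Phi^{(t_1)},
\]

which completes the proof.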
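Lemma 2.12. Fix two rounds $t_1 < t_2$ and the load vector $x^{(t_1)}$ at the end of round $t_1$. For any family of non-negative numbers $g^{(s)}_{u,v}$ ($[u:v] \in \mathbf{M}^{(s)}$, $t_1 + 1 \le s \le t_2$), define the random variable $Z$ by $Z := \sum_{s=t_1+1}^{t_2} \sum_{[u:v] \in \mathbf{M}^{(s)}} g^{(s)}_{u,v} \cdot e^{(s)}_{u,v}$. Then $\mathbf{E}[Z] = 0$ and for any $\delta > 0$ it holds that

\[
\Pr\left[\, |Z - \mathbf{E}[Z]| \ge \delta \,\right] \le 2 \exp\left( - \frac{\delta^2}{2 \sum_{s=t_1+1}^{t_2} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( g^{(s)}_{u,v} \right)^2} \right).
\]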

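Proof. The proof of this lemma is similar to [g, Proof of Theorem 1.1, first statement].

Since $\mathbf{E}\left[ e^{(s)}_{u,v} \right] = 0$ for all $\{u,v\} \in \mathbf{M}^{(s)}$, it follows that $\mathbf{E}[Z] = 0$. Our goal is now to prove that $Z$ is concentrated around its mean by applying the concentration inequality in Theorem A.1. Observe that $Z$ depends on at most $(n/2) \cdot (t_2 - t_1)$ random variables $e^{(s)}_{u,v}$, and each of them corresponds to one of the orientations of at most $n/2$ matching edges in each of the $t_2 - t_1$ rounds. Let us denote this sequence by $Y_\ell$ with $(t_1 + 1) \cdot (n/2) + 1 \le \ell \le (t_2 + 1) \cdot (n/2)$, where $Y_\ell$ with $\ell = \alpha \cdot (n/2) + \beta$, $t_1 + 1 \le \alpha \le t_2$, $1 \le \beta \le n/2$, describes the orientation of the $\beta$-th edge in round $\alpha$ (here we take an arbitrary ordering of the matching edges in round $\alpha$, and if there are fewer than $\beta$ matching edges in round $\alpha$, then $Y_\ell = 0$).

In order to apply Theorem A.1, we first verify that for every $(t_1 + 1) \cdot (n/2) + 1 \le \ell \le (t_2 + 1) \cdot (n/2)$ with $\ell = \alpha \cdot (n/2) + \beta$ and $\{u', v'\}$ being the $\beta$-th matching edge in round $\alpha$,

\[
\left| \mathbf{E}\left[ Z \mid Y_{(t_1+1) \cdot (n/2)+1}, \ldots, Y_\ell \right] - \mathbf{E}\left[ Z \mid Y_{(t_1+1) \cdot (n/2)+1}, \ldots, Y_{\ell-1} \right] \right| \le g^{(\alpha)}_{u',v'} . \tag{2.3}
\]

In order to simplify the notation, let $\mathcal{Y}_\ell := (Y_{(t_1+1) \cdot (n/2)+1}, \ldots, Y_\ell)$ for any $\ell$ with $(t_1 + 1) \cdot (n/2) + 1 \le \ell \le (t_2 + 1) \cdot (n/2)$.

To prove (2.3), we split the sum of $Z$ into three parts: $s < \alpha$, $s = \alpha$ and $s > \alpha$.

Case 1: $t_1 + 1 \le s \le \alpha - 1$. For every $[u:v] \in \mathbf{M}^{(s)}$, $e^{(s)}_{u,v}$ is determined by $\mathcal{Y}_{\ell-1}$. Hence,

\[
\mathbf{E}\left[ \sum_{s=t_1+1}^{\alpha-1} \sum_{[u:v] \in \mathbf{M}^{(s)}} g^{(s)}_{u,v} \cdot e^{(s)}_{u,v} \;\middle|\; \mathcal{Y}_\ell \right] = \mathbf{E}\left[ \sum_{s=t_1+1}^{\alpha-1} \sum_{[u:v] \in \mathbf{M}^{(s)}} g^{(s)}_{u,v} \cdot e^{(s)}_{u,v} \;\middle|\; \mathcal{Y}_{\ell-1} \right].
\]

Case 2: $s = \alpha$. Then,

\[
\begin{aligned}
& \left| \mathbf{E}\left[ \sum_{[u:v] \in \mathbf{M}^{(s)}} g^{(s)}_{u,v} \cdot e^{(s)}_{u,v} \;\middle|\; \mathcal{Y}_\ell \right] - \mathbf{E}\left[ \sum_{[u:v] \in \mathbf{M}^{(s)}} g^{(s)}_{u,v} \cdot e^{(s)}_{u,v} \;\middle|\; \mathcal{Y}_{\ell-1} \right] \right| \\
&\le \sum_{\substack{[u:v] \in \mathbf{M}^{(s)} \\ [u:v] \neq [u':v']}} \left| \mathbf{E}\left[ g^{(s)}_{u,v} \cdot e^{(s)}_{u,v} \mid \mathcal{Y}_\ell \right] - \mathbf{E}\left[ g^{(s)}_{u,v} \cdot e^{(s)}_{u,v} \mid \mathcal{Y}_{\ell-1} \right] \right| \\
&\quad + \left| \mathbf{E}\left[ g^{(s)}_{u',v'} \cdot e^{(s)}_{u',v'} \mid \mathcal{Y}_\ell \right] - \mathbf{E}\left[ g^{(s)}_{u',v'} \cdot e^{(s)}_{u',v'} \mid \mathcal{Y}_{\ell-1} \right] \right| \\
&\le g^{(s)}_{u',v'},
\end{aligned}
\]

where the last inequality holds since the $e^{(s)}_{u,v}$, $[u:v] \in \mathbf{M}^{(s)}$, $[u:v] \neq [u':v']$, are independent of $e^{(s)}_{u',v'}$, and $e^{(s)}_{u',v'} \in \{-1/2, 0, 1/2\}$.

Case 3: $\alpha + 1 \le s \le t_2$. Let $\ell' \ge \ell$ be the smallest integer so that $\mathcal{Y}_{\ell'}$ determines the load vector $x^{(s-1)}$. By the law of total expectation, for any $\{u,v\} \in \mathbf{M}^{(s)}$,

\[
\mathbf{E}\left[ e^{(s)}_{u,v} \mid \mathcal{Y}_\ell \right] = \mathbf{E}\left[ \mathbf{E}\left[ e^{(s)}_{u,v} \mid \mathcal{Y}_{\ell'} \right] \;\middle|\; \mathcal{Y}_\ell \right] = \mathbf{E}\left[ 0 \mid \mathcal{Y}_\ell \right] = 0,
\]

and the same also holds if we replace $\ell$ by $\ell - 1$. Hence by linearity of expectation,

\[
\mathbf{E}\left[ \sum_{s=\alpha+1}^{t_2} \sum_{[u:v] \in \mathbf{M}^{(s)}} g^{(s)}_{u,v} \cdot e^{(s)}_{u,v} \;\middle|\; \mathcal{Y}_\ell \right] = \mathbf{E}\left[ \sum_{s=\alpha+1}^{t_2} \sum_{[u:v] \in \mathbf{M}^{(s)}} g^{(s)}_{u,v} \cdot e^{(s)}_{u,v} \;\middle|\; \mathcal{Y}_{\ell-1} \right].
\]

Combining the contribution of all three cases establishes (2.3). Applying Theorem A.1 finishes the proof. ■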
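To get a feeling for how a tail bound of Azuma-Hoeffding type behaves for weighted sums of rounding errors, the following Python sketch compares the bound $2\exp(-\delta^2 / (2\sum g^2))$ with an empirical tail. It is a simplified model and not the protocol itself: the errors are assumed to be independent and uniform on $\{-1/2, +1/2\}$, and the weights, the threshold and the function names are chosen for illustration only.

```python
import math
import random

def azuma_tail_bound(weights, delta):
    """Bound 2 * exp(-delta^2 / (2 * sum of squared weights))."""
    return 2 * math.exp(-delta ** 2 / (2 * sum(g * g for g in weights)))

def empirical_tail(weights, delta, trials, rng):
    """Fraction of trials with |Z| >= delta, where Z = sum_i g_i * e_i and
    each e_i is drawn independently and uniformly from {-1/2, +1/2}
    (a simplifying assumption; the true errors depend on the loads)."""
    hits = 0
    for _ in range(trials):
        z = sum(g * rng.choice((-0.5, 0.5)) for g in weights)
        if abs(z) >= delta:
            hits += 1
    return hits / trials

if __name__ == "__main__":
    rng = random.Random(7)
    weights = [1.0] * 50
    print(empirical_tail(weights, 15.0, 2000, rng), azuma_tail_bound(weights, 15.0))
```

In this toy setting the empirical tail is far below the bound, which is expected because the bound only uses the magnitude of the martingale differences.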

We now list the following Chernoff-type bounds which can be derived quite easily from Lemma 2.12. Similar bounds have been derived in previous works 0, but we obtain a much better concentration which is independent of the graph's expansion due to our new bound on the local 2-divergence (see the second statement of Lemma 2.13 below).

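Lemma 2.13. Fix an arbitrary load vector $x^{(0)}$. Consider two rounds $t_1 \le t_2$ and assume that the time-interval $[0, t_1]$ is $(K, 1/(2n))$-smoothing. Then for any node $k \in V$ and $\delta > 1/n$, it holds that

\[
\Pr\left[ \left| \sum_{w \in V} x_w^{(t_1)} \cdot \mathbf{M}^{[t_1+1,t_2]}_{w,k} - \bar{x} \right| \ge \delta \right] \le 2 \cdot \exp\left( - \frac{(\delta - 1/(2n))^2}{4 \sum_{w \in V} \left( \mathbf{M}^{[t_1+1,t_2]}_{w,k} - 1/n \right)^2} \right).
\]

In particular, for any node $w \in V$ and $\delta > 1/n$, it holds that

\[
\Pr\left[ \left| x_w^{(t_1)} - \bar{x} \right| \ge \delta \right] \le 2 \cdot \exp\left( - \frac{(\delta - 1/(2n))^2}{4 - 4/n} \right).
\]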



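Proof. By (2.1), for any node $w$ and round $t$ it holds that

\[
x_w^{(t)} = \xi_w^{(t)} + \sum_{s=1}^{t} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \mathbf{M}^{[s+1,t]}_{u,w} - \mathbf{M}^{[s+1,t]}_{v,w} \right) \cdot e^{(s)}_{u,v},
\]

where $\xi^{(0)} = x^{(0)}$. Therefore

\[
\begin{aligned}
\sum_{w \in V} x_w^{(t_1)} \cdot \mathbf{M}^{[t_1+1,t_2]}_{w,k}
&= \sum_{w \in V} \left( \xi_w^{(t_1)} + \sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \mathbf{M}^{[s+1,t_1]}_{u,w} - \mathbf{M}^{[s+1,t_1]}_{v,w} \right) \cdot e^{(s)}_{u,v} \right) \cdot \mathbf{M}^{[t_1+1,t_2]}_{w,k} \\
&= \sum_{w \in V} \xi_w^{(t_1)} \cdot \mathbf{M}^{[t_1+1,t_2]}_{w,k} + \sum_{w \in V} \sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \mathbf{M}^{[t_1+1,t_2]}_{w,k} \cdot \left( \mathbf{M}^{[s+1,t_1]}_{u,w} - \mathbf{M}^{[s+1,t_1]}_{v,w} \right) \cdot e^{(s)}_{u,v} .
\end{aligned}
\]

By Lemma B.5 and $\bar{\xi} = \bar{x}$, after $t_1$ rounds we have $\sum_{w \in V} \xi_w^{(t_1)} \cdot \mathbf{M}^{[t_1+1,t_2]}_{w,k} = \bar{x} \pm \frac{1}{2n}$, where equaling $\bar{x} \pm \frac{1}{2n}$ means lying within the interval $\left[ \bar{x} - \frac{1}{2n}, \bar{x} + \frac{1}{2n} \right]$. Therefore

\[
\sum_{w \in V} x_w^{(t_1)} \cdot \mathbf{M}^{[t_1+1,t_2]}_{w,k} = \bar{x} \pm \frac{1}{2n} + \sum_{w \in V} \sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \mathbf{M}^{[t_1+1,t_2]}_{w,k} \cdot \left( \mathbf{M}^{[s+1,t_1]}_{u,w} - \mathbf{M}^{[s+1,t_1]}_{v,w} \right) \cdot e^{(s)}_{u,v}
\]

and

\[
\begin{aligned}
\Pr\left[ \left| \sum_{w \in V} x_w^{(t_1)} \cdot \mathbf{M}^{[t_1+1,t_2]}_{w,k} - \bar{x} \right| \ge \delta \right]
&\le \Pr\left[ \left| \sum_{w \in V} \sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \mathbf{M}^{[t_1+1,t_2]}_{w,k} \cdot \left( \mathbf{M}^{[s+1,t_1]}_{u,w} - \mathbf{M}^{[s+1,t_1]}_{v,w} \right) \cdot e^{(s)}_{u,v} \right| \ge \delta - \frac{1}{2n} \right] \\
&= \Pr\left[ \left| \sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \sum_{w \in V} \mathbf{M}^{[t_1+1,t_2]}_{w,k} \cdot \left( \mathbf{M}^{[s+1,t_1]}_{u,w} - \mathbf{M}^{[s+1,t_1]}_{v,w} \right) \right) \cdot e^{(s)}_{u,v} \right| \ge \delta - \frac{1}{2n} \right] \\
&\le 2 \cdot \exp\left( - \frac{(\delta - 1/(2n))^2}{2 \sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \sum_{w \in V} \mathbf{M}^{[t_1+1,t_2]}_{w,k} \cdot \left( \mathbf{M}^{[s+1,t_1]}_{u,w} - \mathbf{M}^{[s+1,t_1]}_{v,w} \right) \right)^2} \right),
\end{aligned}
\]

where the last inequality follows from Lemma 2.12. Further,

\[
\begin{aligned}
\sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \sum_{w \in V} \mathbf{M}^{[t_1+1,t_2]}_{w,k} \cdot \left( \mathbf{M}^{[s+1,t_1]}_{u,w} - \mathbf{M}^{[s+1,t_1]}_{v,w} \right) \right)^2
&= \sum_{s=1}^{t_1} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \mathbf{M}^{[s+1,t_2]}_{u,k} - \mathbf{M}^{[s+1,t_2]}_{v,k} \right)^2 \\
&\le 2 \cdot \sum_{w \in V} \left( \mathbf{M}^{[t_1+1,t_2]}_{w,k} - \frac{1}{n} \right)^2,
\end{aligned}
\]

where the last inequality follows from Corollary 2.11. Therefore,

\[
\Pr\left[ \left| \sum_{w \in V} x_w^{(t_1)} \cdot \mathbf{M}^{[t_1+1,t_2]}_{w,k} - \bar{x} \right| \ge \delta \right] \le 2 \cdot \exp\left( - \frac{(\delta - 1/(2n))^2}{4 \sum_{w \in V} \left( \mathbf{M}^{[t_1+1,t_2]}_{w,k} - 1/n \right)^2} \right),
\]

which finishes the proof of the first statement. The second statement follows directly by using the first statement with $t_1 = t_2$, since $\mathbf{M}^{[t_2+1,t_2]} = \mathbf{I}$. ■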

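Based on the upper bound on $\Psi_2(\mathcal{M})$, we obtain the following theorem:

Theorem 2.14. Let $G$ be any graph. Then, the following three statements hold:

• Let $\mathcal{M} = (\mathbf{M}^{(1)}, \mathbf{M}^{(2)}, \ldots)$ be any sequence of matchings. If $x^{(0)} = \xi^{(0)}$, then for any round $t$ and any $\delta \ge 1$, it holds that

\[
\Pr\left[ \max_{w \in V} \left| x_w^{(t)} - \xi_w^{(t)} \right| \ge \sqrt{4 \delta \cdot \log n} \right] \le 2 n^{-\delta+1} .
\]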

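• In the balancing circuit model, we reach a discrepancy of $\sqrt{12 \log n} + 1$ after $\tau_{\mathrm{cont}}(K, 1) = O\left( d \cdot \frac{\log(Kn)}{1 - \lambda(\mathbf{M})} \right)$ rounds with probability at least $1 - 2n^{-2}$. In the random matching model, we reach a discrepancy of $\sqrt{12 \log n} + 1$ after $\tau_{\mathrm{cont}}(K, 1) = O\left( \frac{\log(Kn)}{1 - \lambda_2(\mathbf{P})} \right)$ rounds, with probability at least $1 - 2n^{-1}$.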

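• Consider the random matching model with $x^{(0)} = \xi^{(0)}$. If the initial load vector $x^{(0)}$ has discrepancy at most $K$, then with probability at least $1 - 2n^{-1}$,

\[
\sup_{t \in \mathbb{N}} \max_{w \in V} \left| x_w^{(t)} - \xi_w^{(t)} \right| \le \sqrt{4 \cdot (6 \log n + \log \log K)} + 1 .
\]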
The first statement of Theorem 2.14 states that even if an adversary specifies the matchings for all rounds, it is not possible to achieve a deviation of more than $O(\sqrt{\log n})$ between the discrete and the continuous case (for fixed round and fixed node).
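The following Python sketch runs the discrete and the continuous process side by side on the same matchings and reports the largest deviation at the final round. It is only an illustration under stated assumptions: the graph is a cycle with alternating perfect matchings, the excess token of an odd pair goes to either endpoint with probability 1/2, logarithms are natural, and all names are hypothetical.

```python
import math
import random

def alternating_cycle_matching(n, rng):
    """Perfect matching on the n-cycle (n even): the even or the odd edges."""
    offset = rng.randrange(2)
    return [(i, (i + 1) % n) for i in range(offset, n, 2)]

def simulate(n, rounds, initial, rng):
    """Return (discrete loads, continuous loads) after the given rounds."""
    x = list(initial)                    # indivisible tokens
    xi = [float(v) for v in initial]     # arbitrarily divisible load
    for _ in range(rounds):
        for u, v in alternating_cycle_matching(n, rng):
            total = x[u] + x[v]
            # Random orientation: one endpoint gets the ceiling, the other the floor.
            if rng.random() < 0.5:
                x[u], x[v] = (total + 1) // 2, total // 2
            else:
                x[u], x[v] = total // 2, (total + 1) // 2
            xi[u] = xi[v] = (xi[u] + xi[v]) / 2
    return x, xi

if __name__ == "__main__":
    rng = random.Random(1)
    n = 16
    x, xi = simulate(n, 200, [100] + [0] * (n - 1), rng)
    deviation = max(abs(a - b) for a, b in zip(x, xi))
    print(deviation, math.sqrt(12 * math.log(n)))
```

The reference value $\sqrt{12 \log n}$ corresponds to choosing $\delta = 3$ in a deviation bound of the form $\sqrt{4\delta \log n}$.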

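Proof of Theorem 2.14. Recall by (2.1) that

\[
x_w^{(t)} - \xi_w^{(t)} = \sum_{s=1}^{t} \sum_{[u:v] \in \mathbf{M}^{(s)}} \left( \mathbf{M}^{[s+1,t]}_{u,w} - \mathbf{M}^{[s+1,t]}_{v,w} \right) \cdot e^{(s)}_{u,v} .
\]

Applying Lemma 2.12 with $g^{(s)}_{u,v} = \mathbf{M}^{[s+1,t]}_{u,w} - \mathbf{M}^{[s+1,t]}_{v,w}$ yields for any $w \in V$

\[
\Pr\left[ \left| x_w^{(t)} - \xi_w^{(t)} \right| > \sqrt{2 \delta \cdot \log n} \cdot \Psi_2(\mathcal{M}) \right] \le 2 n^{-\delta} .
\]

Taking the union bound over all $n$ nodes and recalling the bound $\Psi_2(\mathcal{M}) \le \sqrt{2}$ from Theorem 2.10 completes the proof of the first statement. The second statement follows directly from the first statement ($\delta = 3$) and the definition of $\tau_{\mathrm{cont}}(K, 1)$.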

For the proof of the third statement, we require a general estimate on $\lambda_2(\mathbf{P})$. By Cheeger's inequality, $\lambda_2(\mathbf{P}) \le 1 - \Phi(\mathbf{P})^2 / 2$, where $\Phi(\mathbf{P})$ is the conductance [36] defined as

\[
\Phi(\mathbf{P}) = \min_{\substack{S \subseteq V \colon \\ 0 < |S| \le n/2}} \frac{|E(S, V \setminus S)|}{2 \Delta \cdot |S|} .
\]

Since $G$ is connected, $\Phi(\mathbf{P}) \ge \frac{1}{n^2}$ and Cheeger's inequality implies that $\lambda_2(\mathbf{P}) \le 1 - \frac{1}{2 n^4}$. Using Theorem 2.5 it follows that in the continuous case, the discrepancy is at most 1 after $\tau := O\left( \frac{\log(Kn)}{1 - \lambda_2(\mathbf{P})} \right) = O(\log(Kn) \cdot n^4)$ rounds with probability at least $1 - n^{-1}$. Using the first statement with $\delta = 6 + \log \log K / \log n$, it follows by the union bound over the time-interval $[1, \tau]$ that

\[
\Pr\left[ \max_{t \in [0,\tau]} \max_{w \in V} \left| x_w^{(t)} - \xi_w^{(t)} \right| \le \sqrt{4 \delta \cdot \log n} \right] \ge 1 - O(\log(Kn) \cdot n^4) \cdot 2 n^{-\delta+1} \ge 1 - n^{-1} .
\]

Combining the three insights that (i) the maximum load is non-increasing and the minimum load is non-decreasing, (ii) the discrepancy of the continuous case is at most 1 in round $\tau$ (w.p. at least $1 - n^{-1}$) and (iii) the maximum deviation between the discrete and continuous process is at most $\sqrt{4 \delta \cdot \log n}$ in the time-interval $[0, \tau]$ (w.p. at least $1 - n^{-1}$), we conclude that the maximum deviation between the continuous and the discrete case is at most $\sqrt{4 \delta \cdot \log n} + 1$ for all rounds. This proves the third statement and finishes the proof of the theorem. ■



3 Token-Based Analysis via Random Walks 

In this section we introduce our technique which relates the movement of the tokens to independent random walks. In Section 3.1 we formalize this relation and derive strong concentration results for the load distribution on a subset of nodes. In Section 3.2 we use these new concentration results to analyze the discrepancy on arbitrary graphs. All our results in this section hold for arbitrary graphs, for both the balancing circuit and the random matching model.

3.1 Bounding the Load via Random Walks

We now present our new approach that allows us to upper bound the load of a node by assuming that the tokens perform independent random walks in every round. Throughout Section 3.1 we assume that the load vector is non-negative.
Let $\mathcal{T} = \{1, \ldots, \|x^{(0)}\|_1\}$ be the set of all tokens, which are assumed to be distinguishable for the sake of the analysis. The tokens may change their location via matching edges according to the following rule: If two nodes $u$ and $v$ are matched in round $t$, then the $x_u^{(t-1)} + x_v^{(t-1)}$ tokens, which are located at node $u$ or node $v$ at the end of round $t - 1$, are placed in a single urn. After that, if $\Phi^{(t)}_{u,v} = 1$, then node $u$ draws $\left\lceil \frac{x_u^{(t-1)} + x_v^{(t-1)}}{2} \right\rceil$ tokens from the urn uniformly at random without replacement and node $v$ receives the remaining tokens. Otherwise, $\Phi^{(t)}_{u,v} = -1$, and node $u$ draws $\left\lfloor \frac{x_u^{(t-1)} + x_v^{(t-1)}}{2} \right\rfloor$ tokens from the urn and again node $v$ receives the remaining tokens. We observe that each individual token which is located at node $u$ or $v$ at the end of round $t - 1$ is assigned to either $u$ or $v$ with probability $1/2$. Note that this token-based process performs in exactly the same way as the original randomized protocol introduced in Section 2.

We now prove that every token viewed individually performs a random walk with respect to the matching matrices. Henceforth we use $w_i^{(t)}$ to represent the location (the node) of token $i \in \mathcal{T}$ at the end of round $t$. We also use the notation that for any $n$ by $n$ matrix $\mathbf{M}$, any node $u \in V$ and subset $D \subseteq V$, $\mathbf{M}_{u,D} := \sum_{v \in D} \mathbf{M}_{u,v}$.
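The urn rule for a single matched pair can be sketched in a few lines of Python. This is a minimal illustration: the token labels, the function name `balance_pair` and the use of a fair coin for the orientation are assumptions of the sketch.

```python
import random

def balance_pair(tokens_u, tokens_v, rng):
    """Pool the tokens of two matched nodes and split them again.

    Returns (tokens assigned to u, tokens assigned to v)."""
    urn = list(tokens_u) + list(tokens_v)
    rng.shuffle(urn)                      # uniform draw without replacement
    half, excess = divmod(len(urn), 2)
    # Orientation +1: u draws the ceiling; orientation -1: u draws the floor.
    take = half + excess if rng.random() < 0.5 else half
    return urn[:take], urn[take:]

if __name__ == "__main__":
    rng = random.Random(3)
    trials = 4000
    # Empirical probability that token 0 ends at u; it should be close to 1/2.
    at_u = sum(0 in balance_pair([0, 1, 2], [3, 4], rng)[0] for _ in range(trials))
    print(at_u / trials)
```

Averaging over the two orientations is what makes the marginal probability exactly 1/2 even when the number of pooled tokens is odd.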

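Lemma 3.1. Fix any non-negative load vector $x^{(t_1)}$ at the end of round $t_1$ and consider a token $i \in \mathcal{T}$ located at node $u = w_i^{(t_1)}$ at the end of round $t_1$. Then for any $t_2 \ge t_1$,

\[
\Pr\left[ w_i^{(t_2)} = v \right] = \mathbf{M}^{[t_1+1,t_2]}_{u,v},
\]

and more generally, for any set $D \subseteq V$,

\[
\Pr\left[ w_i^{(t_2)} \in D \right] = \mathbf{M}^{[t_1+1,t_2]}_{u,D} .
\]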


Proof. We prove by backward induction on $t$ that for an arbitrary pair of nodes $u, v \in V$ and round $t \in [t_1, t_2]$, the probability for a token which is at node $u$ at the end of round $t$ to be at node $v$ at the end of round $t_2$ equals $\mathbf{M}^{[t+1,t_2]}_{u,v}$. Since $\mathbf{M}^{[t_2+1,t_2]}$ is the identity matrix, the claim is trivially true for $t = t_2$. Consider now any round $t_1 \le t < t_2$ so that the induction hypothesis holds for $t + 1$ and let $i$ be a token at node $u$ at the end of round $t$. If node $u$ is not part of the matching in round $t + 1$, then the induction step holds trivially. So suppose that node $u$ is matched with a node $k$ in round $t + 1$. Since tokens are spread uniformly, it follows that token $i$ will be either at node $u$ or at node $k$ with probability exactly $1/2$ each, regardless of whether the sum of tokens at $u$ and $k$ is even or not. Hence,

\[
\Pr\left[ w_i^{(t_2)} = v \mid w_i^{(t)} = u \right] = \frac{1}{2} \Pr\left[ w_i^{(t_2)} = v \mid w_i^{(t+1)} = u \right] + \frac{1}{2} \Pr\left[ w_i^{(t_2)} = v \mid w_i^{(t+1)} = k \right].
\]

Using the induction hypothesis, it follows that

\[
\Pr\left[ w_i^{(t_2)} = v \mid w_i^{(t)} = u \right] = \mathbf{M}^{(t+1)}_{u,u} \cdot \mathbf{M}^{[t+2,t_2]}_{u,v} + \mathbf{M}^{(t+1)}_{u,k} \cdot \mathbf{M}^{[t+2,t_2]}_{k,v} = \mathbf{M}^{[t+1,t_2]}_{u,v},
\]

which completes the induction. The second statement of the lemma follows immediately by summing over all nodes in $D$. ■
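The statement that a single token follows the product of the matching matrices can be checked empirically. The Python sketch below computes one row of the matrix product exactly and compares it with the observed end positions of a tagged token in the urn process. The matchings, the initial token placement and all names are hypothetical choices for illustration.

```python
import random

def walk_distribution(n, matchings, start):
    """Row `start` of M^(1) M^(2) ... M^(T): the law of a walk started at `start`."""
    dist = [0.0] * n
    dist[start] = 1.0
    for matching in matchings:            # rounds in chronological order
        for a, b in matching:
            dist[a] = dist[b] = (dist[a] + dist[b]) / 2
    return dist

def run_tokens(matchings, location, rng):
    """Run the urn process; location[i] is the node of token i."""
    loc = list(location)
    for matching in matchings:
        for a, b in matching:
            urn = [i for i, node in enumerate(loc) if node in (a, b)]
            rng.shuffle(urn)
            half, excess = divmod(len(urn), 2)
            take = half + excess if rng.random() < 0.5 else half
            for i in urn[:take]:
                loc[i] = a
            for i in urn[take:]:
                loc[i] = b
    return loc

if __name__ == "__main__":
    rng = random.Random(5)
    matchings = [[(0, 1), (2, 3)], [(1, 2)], [(0, 1)]]
    start = [0, 0, 0, 1, 2]               # token 0 starts at node 0
    trials = 20000
    counts = [0] * 4
    for _ in range(trials):
        counts[run_tokens(matchings, start, rng)[0]] += 1
    print([c / trials for c in counts], walk_distribution(4, matchings, 0))
```

The other tokens influence how many tokens each endpoint receives, but they do not change the marginal law of the tagged token.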

The next lemma is the crux of our token-based analysis. It shows that the probability that 
a certain set of tokens will be located on a set of nodes $D$ at the end of round $t_2$ is at most
the product of the individual probabilities. This negative correlation will enable us to derive a 
strong version of the Chernoff bound (see Lemma [ 
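Lemma 3.2. Fix any non-negative load vector $x^{(t_1)}$ at the end of round $t_1$ and let $B \subseteq \mathcal{T}$ be an arbitrary subset of tokens. Then for any subset of nodes $D \subseteq V$ and round $t_2 > t_1$, it holds that

\[
\Pr\left[ \bigwedge_{i \in B} \left( w_i^{(t_2)} \in D \right) \right] \le \prod_{i \in B} \Pr\left[ w_i^{(t_2)} \in D \right] = \prod_{i \in B} \mathbf{M}^{[t_1+1,t_2]}_{w_i^{(t_1)}, D} .
\]

Proof. We only have to prove the inequality, as the equality follows directly from Lemma 3.1. Assume for simplicity that all tokens in $B$ are numbered from 1 to $k := |B|$. To simplify the notation, we define for any token $i \in B$ and round $t$, $t_1 \le t \le t_2$,

\[
z_i^{(t)} := \mathbf{M}^{[t+1,t_2]}_{w_i^{(t)}, D} \quad \text{and} \quad Z^{(t)} := \prod_{i=1}^{k} z_i^{(t)} .
\]

Our goal is to prove that the sequence $Z^{(t)}$, $t \ge t_1$, forms a supermartingale with respect to the sequence of load vectors $x^{(t_1)}, \ldots, x^{(t_2)}$, i.e.,

\[
\mathbf{E}\left[ Z^{(t)} \mid x^{(t-1)}, \ldots, x^{(t_1)} \right] \le Z^{(t-1)} . \tag{3.1}
\]

Assuming that (3.1) holds, we can deduce the statement of the lemma as follows:

\[
\mathbf{E}\left[ Z^{(t_2)} \right] \le \mathbf{E}\left[ Z^{(t_1)} \right] = Z^{(t_1)} = \prod_{i=1}^{k} z_i^{(t_1)} = \prod_{i=1}^{k} \mathbf{M}^{[t_1+1,t_2]}_{w_i^{(t_1)}, D} .
\]

By definition,

\[
Z^{(t_2)} = \prod_{i=1}^{k} \mathbf{M}^{[t_2+1,t_2]}_{w_i^{(t_2)}, D},
\]

which is one if $w_i^{(t_2)} \in D$ for all $i \in B$ and zero otherwise. Therefore $\mathbf{E}\left[ Z^{(t_2)} \right] = \Pr\left[ Z^{(t_2)} = 1 \right] = \Pr\left[ \bigwedge_{i \in B} \left( w_i^{(t_2)} \in D \right) \right]$, and the proof is complete.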

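It remains to prove (3.1). To this end, fix the load vector $x^{(t-1)}$ and partition the set of tokens $B = \{1, \ldots, k\}$ into disjoint sets $S_1, S_2, \ldots, S_{k'}$ with $1 \le k' \le k$ so that every token in $S_j$ has the same set of possible destinations at the end of round $t$. Since tokens with different sets of possible destinations behave independently in round $t$, it follows that

\[
\mathbf{E}\left[ Z^{(t)} \mid x^{(t-1)}, \ldots, x^{(t_1)} \right] = \mathbf{E}\left[ \prod_{j=1}^{k'} \prod_{i \in S_j} z_i^{(t)} \;\middle|\; x^{(t-1)}, \ldots, x^{(t_1)} \right] = \prod_{j=1}^{k'} \mathbf{E}\left[ \prod_{i \in S_j} z_i^{(t)} \;\middle|\; x^{(t-1)}, \ldots, x^{(t_1)} \right]. \tag{3.2}
\]

Hence in order to prove (3.1) it suffices to prove that for every fixed $j \in \{1, \ldots, k'\}$,

\[
\mathbf{E}\left[ \prod_{i \in S_j} z_i^{(t)} \;\middle|\; x^{(t-1)}, \ldots, x^{(t_1)} \right] \le \prod_{i \in S_j} z_i^{(t-1)} . \tag{3.3}
\]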
Consider first those sets $S_j$ so that every token $i \in S_j$ has only one destination, meaning that node $w_i^{(t-1)}$ is not incident to any matching edge in round $t$. In this case, clearly we have $w_i^{(t)} = w_i^{(t-1)}$ and hence $z_i^{(t)} = z_i^{(t-1)}$, and consequently (3.3) holds.

The second and more involved case concerns those sets $S_j$ so that every token in $S_j$ has two possible destinations at the end of round $t$, denoted by $u = u(j)$ and $v = v(j)$, with $\{u, v\} \in \mathbf{M}^{(t)}$. Assume for simplicity that $\mathbf{M}^{[t+1,t_2]}_{u,D} \ge \mathbf{M}^{[t+1,t_2]}_{v,D}$ and that the tokens in $S_j$ are numbered from 1 to $\gamma = |S_j|$. Then for every token $i \in \{1, \ldots, \gamma\}$, define a random variable $X_i$ as follows:

\[
X_i := \begin{cases} 1 & \text{if token } i \text{ is assigned to node } u \text{ in round } t, \\ 0 & \text{if token } i \text{ is assigned to node } v \text{ in round } t. \end{cases}
\]

Our claim is that the (random) vector $X = (X_1, \ldots, X_\gamma) \in \{0, 1\}^{\gamma}$ satisfies the negative regression condition (cf. Definition A.4), i.e., for every two disjoint subsets $\mathcal{L}$ and $\mathcal{R}$ of $S_j$, and every non-decreasing function $f \colon \{0, 1\}^{|\mathcal{L}|} \to \mathbb{R}$, it holds that

\[
\mathbf{E}\left[ f(X_q, q \in \mathcal{L}) \mid X_r = \sigma_r, r \in \mathcal{R} \right]
\]

is non-increasing in each $\sigma_r \in \{0, 1\}$, $r \in \mathcal{R}$. To establish this, it suffices to show that

\[
\mathbf{E}\left[ f(X_q, q \in \mathcal{L}) \mid X_r = \sigma_r, r \in \mathcal{R} \right] \ge \mathbf{E}\left[ f(X_q, q \in \mathcal{L}) \mid X_r = \tilde{\sigma}_r, r \in \mathcal{R} \right], \tag{3.4}
\]

where $\sigma_r = \tilde{\sigma}_r$ for every $r \in \mathcal{R}$ except for one $r' \in \mathcal{R}$, where $\tilde{\sigma}_{r'} > \sigma_{r'}$. To prove the above inequality, we use a coupling argument. We expose the locations of the tokens in $S_j$ one after another in an arbitrary order. In particular, we may expose the destinations of tokens in $S_j$ before considering the other tokens (the ones not in $B$) which are located on $u$ and $v$ at the beginning of round $t$. Note that for every token $i \in S_j$, the probability of being placed at node $u$ (or $v$) depends on the placement of the previous tokens. In fact, the exact probability is not important here; instead we shall only use the fact that the probability for a token to be assigned to node $u$ is non-increasing in the number of tokens that have been assigned to $u$ before. For any $1 \le i \le \gamma + 1$, let $\alpha(i)$ be the number of tokens in $\{1, \ldots, i - 1\}$ which are assigned to node $u$. Hence if we associate to every token $i \in S_j$ a uniform random variable $U_i \in [0, 1]$, then there exists a threshold function $T(i, \alpha(i)) \in [0, 1]$ satisfying the following properties:

1. if $U_i \ge T(i, \alpha(i))$, then token $i$ is assigned to node $u$,

2. if $U_i < T(i, \alpha(i))$, then token $i$ is assigned to node $v$,

3. $T(i, \alpha(i))$ is non-decreasing in $\alpha(i)$.

Without loss of generality assume that $S_j = \{1, \ldots, \gamma\}$, $\mathcal{R} = \{1, \ldots, r\}$, $\mathcal{L} = \{r + 1, \ldots, r + \ell\}$, $r + \ell \le \gamma$. Recall that (3.4) involves two conditional probability spaces, one for $X_r = \sigma_r$, $r \in \mathcal{R}$, and the other one for $X_r = \tilde{\sigma}_r$, $r \in \mathcal{R}$. We denote these probability spaces by $\Omega$ and $\tilde{\Omega}$, respectively.

Since these probability spaces are only conditional on the placements of tokens in $\mathcal{R}$, we can couple both probability spaces by assuming that the random variables $U_i$ attain the same values for every $i \in \mathcal{L}$ in $\Omega$ and $\tilde{\Omega}$. Further, let us denote by $\tilde{\alpha}(i)$ the number of tokens in $\{1, \ldots, i - 1\}$ which are placed on node $u$ in $\tilde{\Omega}$. Then, the values $U_i$ ($i \in \mathcal{L}$), $\alpha(r + 1)$ and $\tilde{\alpha}(r + 1)$ determine the placement of all tokens in $\mathcal{L}$ for the two probability spaces. By assumption on $\sigma$, we have $\alpha(r + 1) \in \{\tilde{\alpha}(r + 1) - 1, \tilde{\alpha}(r + 1)\}$.

Now by the three properties and the coupling described above, if for some $i \in \mathcal{L}$, $\alpha(i) = \tilde{\alpha}(i)$, then $\alpha(i + 1) = \tilde{\alpha}(i + 1)$ and all further tokens in $\mathcal{L}$ are placed in the same way in both probability spaces. Additionally, if for some $i \in \mathcal{L}$, $\alpha(i) = \tilde{\alpha}(i) - 1$, then it follows that $\alpha(i + 1) \in \{\tilde{\alpha}(i + 1) - 1, \tilde{\alpha}(i + 1)\}$ by the monotonicity property of the threshold. This means that every token $i \in \mathcal{L}$ that is placed on $u$ in $\tilde{\Omega}$ will also be placed on $u$ in $\Omega$. Since $f$ is non-decreasing in each coordinate (which corresponds to the placement of one token in $\mathcal{L}$), the coupling argument above establishes (3.4).

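By (3.4), the vector $X = (X_1, \ldots, X_\gamma)$ satisfies the negative regression property. Then for any $i \in \{1, \ldots, \gamma\}$, define a random variable $h(X_i)$ as follows: $h(X_i) = \mathbf{M}^{[t+1,t_2]}_{u,D}$ if $X_i = 1$ and $h(X_i) = \mathbf{M}^{[t+1,t_2]}_{v,D}$ if $X_i = 0$. By the choice of $u$ and $v$, we know that $h(X_i)$ is non-decreasing in every coordinate. Hence,

\[
\mathbf{E}\left[ \prod_{i \in S_j} z_i^{(t)} \;\middle|\; x^{(t-1)}, \ldots, x^{(t_1)} \right] = \mathbf{E}\left[ \prod_{i \in S_j} h(X_i) \right] \overset{\text{Lemma A.5}}{\le} \prod_{i \in S_j} \mathbf{E}\left[ h(X_i) \right] \overset{\text{Lemma 3.1}}{=} \prod_{i \in S_j} \left( \frac{1}{2} \mathbf{M}^{[t+1,t_2]}_{u(j),D} + \frac{1}{2} \mathbf{M}^{[t+1,t_2]}_{v(j),D} \right).
\]

Since $\mathbf{M}^{[t,t_2]}_{u(j),D} = \frac{1}{2} \mathbf{M}^{[t+1,t_2]}_{u(j),D} + \frac{1}{2} \mathbf{M}^{[t+1,t_2]}_{v(j),D} = \mathbf{M}^{[t,t_2]}_{v(j),D}$, we arrive at

\[
\mathbf{E}\left[ \prod_{i \in S_j} z_i^{(t)} \;\middle|\; x^{(t-1)}, \ldots, x^{(t_1)} \right] \le \prod_{i \in S_j} \mathbf{M}^{[t,t_2]}_{w_i^{(t-1)}, D} = \prod_{i \in S_j} z_i^{(t-1)} .
\]

Applying this to (3.2) for all $j \in \{1, \ldots, k'\}$ implies that

\[
\mathbf{E}\left[ Z^{(t)} \mid x^{(t-1)}, \ldots, x^{(t_1)} \right] = \prod_{j=1}^{k'} \mathbf{E}\left[ \prod_{i \in S_j} z_i^{(t)} \;\middle|\; x^{(t-1)}, \ldots, x^{(t_1)} \right] \le \prod_{j=1}^{k'} \prod_{i \in S_j} z_i^{(t-1)} = Z^{(t-1)},
\]

showing that $Z^{(t)}$ is indeed a supermartingale. This establishes (3.1) and finishes the proof of the lemma. ■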
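For a single balancing step the negative correlation can be verified exactly. The Python sketch below computes the probability that $b$ fixed tokens all end at the same endpoint $u$ when $m$ tokens share one urn, and compares it with the product $(1/2)^b$ of the individual probabilities. The function name and the restriction to one matched pair are assumptions of this illustration.

```python
from fractions import Fraction
from math import comb

def prob_all_at_u(m, b):
    """Exact probability that b fixed tokens all end at node u when m tokens
    are pooled and u draws ceil(m/2) or floor(m/2) of them, each with
    probability 1/2, uniformly without replacement."""
    def all_drawn(k):
        # Probability that a uniform k-subset of the m tokens contains the b fixed ones.
        if k < b:
            return Fraction(0)
        return Fraction(comb(m - b, k - b), comb(m, k))
    return Fraction(1, 2) * all_drawn((m + 1) // 2) + Fraction(1, 2) * all_drawn(m // 2)

if __name__ == "__main__":
    for m, b in [(3, 2), (5, 2), (9, 3)]:
        print(m, b, prob_all_at_u(m, b), Fraction(1, 2) ** b)
```

For example, with three pooled tokens the probability that two fixed tokens both end at $u$ is $1/6$, which is smaller than $1/4$.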



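Combining Lemma 3.2 and Lemma A.2, we obtain directly the following Chernoff bound:

Lemma 3.3. Fix any non-negative load vector $x^{(t_1)}$ at the end of round $t_1$ and let $\mathcal{T}$ be the set of all tokens. Let $D$ be any subset of nodes and $t_2 > t_1$. Then for the random variable

\[
Z := \sum_{u \in D} x_u^{(t_2)},
\]

it holds for any $\delta > 0$ that

\[
\Pr\left[ Z \ge (1 + \delta) \, \mathbf{E}[Z] \right] \le \left( \frac{e^{\delta}}{(1 + \delta)^{1 + \delta}} \right)^{\mathbf{E}[Z]} .
\]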
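A multiplicative Chernoff bound of the form $\left(e^{\delta} / (1+\delta)^{1+\delta}\right)^{\mathbf{E}[Z]}$ is easy to evaluate numerically. The helper below is a small sketch (the function name is ours) that works in log-space so that large values of $\delta$ do not overflow.

```python
import math

def chernoff_upper_tail(mean, delta):
    """Evaluate (e^delta / (1 + delta)^(1 + delta))^mean for delta > 0."""
    log_bound = mean * (delta - (1 + delta) * math.log1p(delta))
    return math.exp(log_bound)

if __name__ == "__main__":
    # Sparse regime: a small mean and a large relative deviation.
    for delta in (1.0, 9.0, 99.0):
        print(delta, chernoff_upper_tail(0.1, delta))
```

For a fixed mean the bound decreases in $\delta$, and for a fixed target value $(1+\delta)\mathbf{E}[Z]$ it becomes stronger as the mean shrinks, which is the regime of sparse load vectors.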



The strength of Lemma 3.3 is that the sum of loads is analyzed by means of a sum of indicator random variables over all tokens instead of a sum of rounding errors (e.g., Lemma 2.13). For an illustration of the power of Lemma 3.3 we consider the following toy example.

Corollary 3.4. Let $x^{(0)}$ be any non-negative load vector with $\|x^{(0)}\|_1 \le n^{1-\epsilon}$, where $\epsilon > 0$ is a constant. Then the discrepancy after $\tau_{\mathrm{cont}}(1, n^{-1})$ rounds is at most $9/\epsilon$ with probability at least $1 - 2 \cdot n^{-1}$.

We can think of the allocation of the tokens in terms of the famous balls-and-bins model [30]. If we run our randomized protocol for sufficiently many rounds, say $\tau_{\mathrm{cont}}(1, n^{-1})$ rounds, then every token (corresponding to a ball) is located at any node (corresponding to a bin) with almost the same probability. While in the standard balls-and-bins model the allocations of different balls are mutually independent, Lemma 3.2 established that in our model these allocations are negatively correlated. Therefore, as is the case for the balls-and-bins model, we obtain a constant maximum load if the number of tokens is bounded by $n^{1-\epsilon}$.






Proof of Corollary 3.4. Fix any node $u \in V$. By definition, it holds for $\tau := \tau_{\mathrm{cont}}(1, n^{-1})$ that the time-interval $[0, \tau]$ is $(1, n^{-1})$-smoothing with probability at least $1 - n^{-1}$ (this probability is even 1 for the balancing circuit model). Consider $Z := x_u^{(\tau)}$. Assuming that $[0, \tau]$ is $(1, n^{-1})$-smoothing, Lemma B.5 implies that every token is located at node $u$ with probability at most $2/n$. Hence $\mathbf{E}[Z] \le \|x^{(0)}\|_1 \cdot 2/n \le 2 n^{-\epsilon} < 1$. Applying Lemma 3.3 with $\delta = \frac{8}{\epsilon \cdot \mathbf{E}[Z]} > 9$ yields

\[
\Pr\left[ Z \ge 9/\epsilon \right] \le \Pr\left[ Z \ge (1 + \delta) \, \mathbf{E}[Z] \right] \le \delta^{-\delta \mathbf{E}[Z] / 2} = \left( \frac{\epsilon \cdot \mathbf{E}[Z]}{8} \right)^{4/\epsilon} < n^{-2},
\]

where the inequality above holds for sufficiently large $n$. Taking the union bound over all nodes yields the claim. ■

The next lemma provides a concrete tail bound which is not just exponential in the deviation from the mean but also exponential in the "sparseness" of the load vector. By contrast, previous analyses expressing the load as a sum of rounding errors [8, 13, 20, 29, 35] yield weaker tail bounds for sparse load vectors (cf. Lemma 2.13). Another advantage of Lemma 3.5 is that it gives a tail bound for an arbitrary convex combination of the load vector.

Lemma 3.5. Fix any non-negative load vector x^ with Hsc^'Hi ?S n ■ e — ( logn ' for some 
constant a G (0,1). Moreover, consider a round ti > t\ so that \ti,t<2\ is (n,n -3 ) -smoothing 

and let Z := ^2 ve yy v Xv 2 \ where y is any non-negative vector with \\y\\\ = 1. Then for any 
5 > 0, 



Pr 



Z > e->s*>* + 8IMU • (logn) 5 ] < e -Q°*»)>+*/^ 
Proof. Let a := Hyll—,. Partition V into at most 21og 2 n groups defined as follows: 
Si := {v € V: 2- 1 - 1 < y v < 2"*} , \teg 2 (l/a)\ < i < \2log 2 n], 
S r2 i og2 nl := {veV:y v ^2-^°^}. 

Clearly, |Sj| ^ 2 l+1 , which also holds for S^2\og 2 n] as there are only n nodes in V. In order to 
obtain a bound on Z, we will upper bound the following approximation of Z: 

\2 log 2 n\ 

i=Li°g 2 (V a )J «es< 

Note that $Z \le \tilde{Z}$. We now apply our relation between the movements of tokens and independent random walks to upper bound $\tilde{Z}$. We do this by considering the contribution from each $S_i$ individually. Since $[t_1, t_2]$ is $(n, n^{-3})$-smoothing, Lemma B.5 implies that every token is located at any node in round $t_2$ with probability at most $2/n$; thus, every token is located at a node in $S_i$ with probability at most $2|S_i|/n$.

Suppose first that |Sj| #s e2^ ogn ^ CT . By Lemma 13.21 the probability that we have more than 
4|Sj| • e~4^ ogn ^ tokens on nodes in Si in round ti is upper bounded by 

„..-*..)■ \ fns.iv^"""' /e-e-^V^"""' , _„„„,„.,„., 



4|S,-| -e-2 (logn)c 



< = < e" 



Next assume that |SiK e2( logn ) CT . For every token j G {1, . . . , ||x^||i}, let X^j = 1 if token 

112.(0)11 

j is located at a node in Si in round ti and Xjj = otherwise. Let Xj := z2j=i Xij' Then, 



z<°> 



2 • IS,; 



1 n 



< e -(logn) CT . 2e f (l°g™) CT = 2 e -5( lo g n )° 
Using the Chernoff bound (Lemma 3.3),

Pr[X, > (1 + 0)E[X 4 ]] < [ {1 + f3)1+ , ) < r mXi]/2 , 

where the second inequality holds for any f} ^ 9. Here, we choose (5 := (logn) 5 /E [Xi] ^ 9 to 
conclude that 

Pr [Xi > 2- (logn) 5 ] < (EfX,])^^^ ^ (Vi^) CT ) (1 ° gn)V2 ^ ^(log^A 

assuming n is sufficiently large. 

By the union bound over at most [2 log 2 n\ groups, we conclude that with probability at 
least 1 - [2 logon] • maxfe-^ " 8 "''^-^")^^} > i _ e -(iogn)'+V6 j 

[2 log 2 n] [2 log 2 n] 

z < E E^ 2) - 2 ~* = E ^- 2 " J 

i= Llog 2 (l/a)J i>eS 4 i= Llog 2 (l/a)J 

[2 log 2 n] 



y 2-» .( 4 N +2 . to 

i=Llog 2 (l/a)J Ve4 

|"21og 2 n] +] r21og 2 n] 



V 2-*- \' 2 \ + V 2- i+1 -(logn 



5 



^ p| (log n)° 

i=Llog 2 (l/a)J ' i=Llog 2 (l/a)J 

64 logo n .£ 

e |(logn) CT 

^ e -i(logn)" +8a . ( logn )5. B 

3.2 Bounding the Discrepancy in Arbitrary Graphs 

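Throughout this subsection, we assume without loss of generality that $x^{(0)} \in \mathbb{Z}^n$ is any initial load vector with $\bar{x} \in [0, 1)$ (cf. Observation 2.6 for a justification). Let us fix any value $\epsilon > 0$, not necessarily constant. Then define the following set of vectors for any $\ell \ge 1$:

\[
\mathcal{E}_\ell := \left\{ x \in \mathbb{Z}^n \colon \sum_{u \in V} \max\left\{ x_u - 8\ell \cdot \lceil (\log n)^{\epsilon} \rceil - \ell, 0 \right\} \le 4n \cdot e^{-\frac{1}{4} \cdot (\log n)^{\ell \epsilon}} \right\}.
\]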

Roughly speaking, $\mathcal{E}_\ell$ includes all load vectors whose number of tokens above the threshold $8\ell \cdot \lceil (\log n)^{\epsilon} \rceil + \ell$ is not too large. In particular, for any load vector $x \in \mathcal{E}_\ell$, $\ell \ge \lceil 2/\epsilon \rceil$, the maximum load of $x$ is at most $8\ell \cdot \lceil (\log n)^{\epsilon} \rceil + \ell$.

The next lemma shows that if we start with a load vector in $\mathcal{E}_{\ell-1}$, then the load vector after $\tau_{\mathrm{cont}}(1, n^{-2})$ rounds will be in $\mathcal{E}_\ell$ with high probability.

Lemma 3.6. For any integer t ^ 2, t G N, e ^ 16/(loglogn) and any vector x G Ee—i, 

>l_ e -*0W-„-i 



Pr 



x ( ' +k) G £> I = x 



where k := r con t(l,re 2 ). Furthermore, Pr [x^ G£i] ^ 1 — e 4( lo s n ) e _ 2>n 1 , i/ k : = 
r cont (K,l/(2n)). 

Let us briefly describe the key steps for the proof of Lemma 3.6. The proof that $x^{(\kappa)} \in \mathcal{E}_1$ (with high probability) makes use of the concentration inequality for the sum of rounding errors (Lemma 2.13, second statement), which in turn is based on our upper bound on the local 2-divergence (Theorem 2.10). The proof that, starting with a load vector in $\mathcal{E}_{\ell-1}$, we obtain a load vector which is in $\mathcal{E}_\ell$ after $\kappa$ additional rounds is based on our new concentration inequality Lemma 3.3.






Proof of Lemma WM Recall that we assume here that x G [0, 1). Let us first consider the event 
x( K ) G E\. Consider the following potential function in round k: 

2 / 



$(") := J^exp ((x^ -x 



16 . 



Since k = T con t(K, l/(2n)), it follows that with probability at least 1 — n , the time-interval 
[0, k] is (K, l/(2ra))-smoothing, which we will condition on for the remainder of the proof. By 
the second statement of Lemma 12.131 it holds for any node u G V and any 5 > 1/n that 



and therefore 



E 



Pr 



$( K ) 



> 5 



^ 2exp 



^ n ■ max E 



exp l l x$ - x 



5 

2n 



16 



(3.5) 



n ■ max 



k=l 

^ n • max 2 + > Pr 



exp l l x[ K) - x 



16 } > k 



n ■ max 

u&V 



k=3 

oo 



2 + E Pl 



fc=3 



x[f> - X 



16 ^ log k 



^ 4-y/log A; 



Combining this with (|3.5p . we get 



E 




^ n • max 2 + 









fc=3 



1 

2n 



n- 2 + ^2 



-3 log k 



^ 4n, 



fc=3 



where in the last inequality we used the fact that ^^_ 3 2& 3 ^ | ^^=3 k 2 ^ 2. Hence by 
Markov's inequality, 



Pr 



> 4n-e2- (logn)e 



^ Pr 



$0) > e r( lo s n ) £ _e <j)( K ) < e -r(i°g n ) £ 



(3.6) 



Furthermore, recall that by Theorem 12.141 the maximum load at the end of round k is upper 
bounded by ^12 log n + 2 with probability at least 1 — 2n _1 . In the following we tacitly assume 
that both $M ^ 4ra • e 2' (logn)E and x^l x ^ ^/l21ogn + 2 hold. Under this condition, the total 
number of tokens above the threshold 8 • [~(logn) e ] + 1 in round k is upper bounded by 

4n-eH logn)e 



max |a4 K) - 8 • [(log n) £ ] - 1, o} 



12 log n + 2 ^ 4n • e 



e 4-(logn) 2e 

where the second last inequality is due to the fact that 

exp ((8 • \(logn) £ ] + 1 -x) 2 /l6) > e 4 - (logn)2 \ 
Combining (|3.6p and ()3.7p we have 



-(logn) £ 



, (3.7) 



Pr 



max - 8 • r(logn) e ] - 1, o} ^ 4n • e - (logn) 



€ e 4 



jr(logn) e 
which implies



Pr 



G S x 



> 1 - e 



-~(logn) E 



completing the proof of the first statement. 

For the second statement with I ^ 2, we consider the probability space conditioned on 
x (t) _ x g To simplify the notation, we will ignore this condition in the following 

probabilities and expectations. For any round s ^ t, define another load vector as follows: 



~(s) 

x), ' := max 



{^)-8(^-l)-r(logr0H-(^-l),0}. 



Due to Observation 12.61 we can use to upper bound the load of xffl by using the inequality 
Xu ^ x$ + 8(£ — 1) • [(log n) £ ] + (£ — 1), valid for any «ey and s ^ t. Therefore, we focus on 
the (non-negative) load vector x^ in the following. Fix an arbitrary node u G V . Let Zj be the 
0/1-indicator random variable for every token i £ T, which is one if and only if token i reaches 
node u at the end of round t + k. Let /3 := ||x^||i and Z := X^f=i Z- Clearly, xi* + ^ = Z. 

Note that for token i which is located at node wf* at the end of round t, we have by 
Lemma 13.11 that 

Pr[Zi = l] = M l *t}' t+K] . 

Since k = T con t(L n ~ 2 ), the time-interval + is (1, n~ 2 )-smoothing with probability at least 
1 — n , which we will assume in the following. Hence Lemma [B.5I yields for every pair of nodes 
v, u G V, 

M [t+l,f+«] < i , J_ 
n n z 

By Lemma 1331 we can upper bound Z as follows: For any 5 > 0, 

„<5 \ E [ Z ' 



Pr[Z ^ (l + i5)E[Z]] ^ 
By definition of the set £e-x, 



(1 + 5) 



1+5 



e\S-E[z) 



(3.8) 







(t) 



i = max {4 4) - 8(^-1)- Klogn) e l -(£-l),o} An ■ e -H lo s™) ( ' 



Hence, for sufficiently large n, 



E 



•t=i v 7 



-i). e 



(3.9) 



Since E[Z] ^ 1, 



Pr 



x 



(t+K.) 



^<5E[Z] + 1 =Pr[Z^5E[Z] + l]sCPr[Z^(l + 5)E[Z]]. 



Choosing 5 = 5(a) = wryi ■ (8 • (logn) £ + a) (for any integer a ^ 0) in (|3.8p . we obtain 



Pr 



x(t+«) 



> (8- (logn) e + a) + l 



-(8-(logn) e +a) 



e[z; 



-(8-(logn) £ +a) 



by 



1 



s$ exp ( ~ • (logn)^ 1 ^ • 



(logn) £ + a) , (3.10) 
where in the second inequality we used that by our lower bound on $\varepsilon$, $8\cdot(\log n)^{\varepsilon} \ge 5$. Our goal
is now to bound the number of tokens in the load vector $\tilde{x}^{(t+\kappa)}$ that are above the threshold
$8\cdot\lceil(\log n)^{\varepsilon}\rceil + 1$. To this end, define the potential function $\Lambda^{(t+\kappa)}$ with respect to load vector
$\tilde{x}^{(t+\kappa)}$ by

$$\Lambda^{(t+\kappa)} := \sum_{u\in V} \max\left\{ \tilde{x}_u^{(t+\kappa)} - 8\cdot\lceil(\log n)^{\varepsilon}\rceil - 1,\ 0 \right\}.$$



Then we can upper bound the expectation of A^ t+K ^ as follows: 



E 



A (t+*)1 =^^Pr 

u<=V a=l 



max ■ 



■j,EV a=l 
by EM °^ 

^ Z^Z^ ex P 

Since \ ■ (logn)^ -1 )' e ^ 1 we have 



{^ +K )-8-r(logn)H-l,0} 

Yl Pr [ S « +K) > 8 • r( lo e n ) £ i + 1 + a 



1,01 > a 



-(log n 



,(*-!)* . 



(logn) e + a) 



E 



A (t+K) 



^ri-exp(-(logn)^- £+£ 

\f-e 1 



a=l 



^ n • e" 



-(logn) f 



1-e- 



< 4n-e" (logn) 



Using Markov's inequality, 



Pr 



^ e 4 



3 (logn) f 



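Assuming that $\Lambda^{(t+\kappa)} \le 4n\cdot e^{-\frac{1}{4}(\log n)^{\ell\varepsilon}}$ occurs, it follows by the definition of $\Lambda^{(t+\kappa)}$ that

$$\Lambda^{(t+\kappa)} = \sum_{u\in V} \max\left\{ \tilde{x}_u^{(t+\kappa)} - 8\cdot\lceil(\log n)^{\varepsilon}\rceil - 1,\ 0 \right\} \le 4n\cdot e^{-\frac{1}{4}(\log n)^{\ell\varepsilon}}.$$

Since $x_u^{(t+\kappa)} \le \tilde{x}_u^{(t+\kappa)} + 8(\ell-1)\cdot\lceil(\log n)^{\varepsilon}\rceil + (\ell-1)$,

$$\sum_{u\in V} \max\left\{ x_u^{(t+\kappa)} - 8\ell\cdot\lceil(\log n)^{\varepsilon}\rceil - \ell,\ 0 \right\} \le 4n\cdot e^{-\frac{1}{4}(\log n)^{\ell\varepsilon}}.$$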

Therefore, 



Pr 



which finishes the proof. 



x. 



> 1 _ e -!(logn)'- > l_ e -i(logn)^ 



Iterating Lemma 3.6 reveals an interesting tradeoff. First, we obtain a discrepancy of
$O((\log n)^{\varepsilon})$ for an arbitrarily small constant $\varepsilon > 0$ by increasing the runtime $T_{\mathrm{cont}}(K, n^{-2})$
only by a constant factor. Furthermore, choosing $\varepsilon = \Theta(1/\log\log n)$ and $\ell$ appropriately, we
obtain a discrepancy of $O(\log\log n)$ by increasing the runtime by a factor of $O(\log\log n)$.

Theorem 3.7. Let $G$ be any graph and consider the random matching or balancing circuit
model.

• Let $\varepsilon > 0$ be an arbitrarily small constant. Then after $O(T_{\mathrm{cont}}(K, n^{-2}))$ rounds, the
discrepancy is $O((\log n)^{\varepsilon})$ w.p. $1 - e^{-(\log n)^{\Omega(1)}}$.
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• After 0{ T cont{K,n 2 ) - log log n) rounds, the discrepancy is O(loglogn) w. p. 1 — , . 
Proof. By Lemma [3T6l with k := T con t(K, l/(2n)), for any vector x E £g~\ and any round t E N, 



Pr 



(*) 



> 1 - e"4 



j(log«) £ 



- 3n" 
3n _1 



(3.11) 



and the same lower bound also holds for Pr [ x^ E £ i ] . Our goal is to show that rr^' K ) is in 
Eg, where £ := [|] . Applying ()3.11j) ^-times and the union bound, 



Pr 



{&K) e£>1 ^ f e -2 (logn)E +3n~ lN ) ^ 1 



-§(logn) e 



where the second inequality holds since e and £ are constants. If the load vector is in Eg, then 

1 



u£V 



max < x 



.(*«) 



[(log n) 



'2 








e 



> ^ 4n • exp 



(logra 



rli 



< i, 



which implies that the maximum load in round $\ell\cdot T_{\mathrm{cont}}(1, n^{-2})$ is $O((\log n)^{\varepsilon})$. The corresponding
lower bound on the minimum load follows by symmetry (see Lemma 2.7). Hence with probability
$1 - e^{-(\log n)^{\Omega(1)}}$, the discrepancy in round $\ell\cdot T_{\mathrm{cont}}(1, n^{-2})$ is $O((\log n)^{\varepsilon})$, completing the proof of
the first statement.

Let us now prove the second statement. First observe that if $x^{(t)} \in \mathcal{E}_{\ell}$ for some round $t$, then
also $x^{(t+1)} \in \mathcal{E}_{\ell}$. We now choose $\varepsilon := \frac{16}{\log\log n}$, $\ell := \lceil\log\log n\rceil$ and bound the number of rounds
required to reach a load vector which is in $\mathcal{E}_{\ell}$. We divide this time into phases each of which is
of length $\max\{T_{\mathrm{cont}}(K, 1/(2n)),\ T_{\mathrm{cont}}(1, n^{-2})\} \le T_{\mathrm{cont}}(K, n^{-2}) =: \tau$. As the success probability
$p$ in (3.11) is only a positive constant for our choice of $\varepsilon$, we have to repeat some of the phases.
However, the number of repetitions $R$ before we reach a load vector in $\mathcal{E}_{\ell}$ is stochastically smaller
than the sum of $\ell$ independent geometric random variables each of which has success probability
$p$. Hence by Lemma A.3, with probability $1 - \exp(-\Omega(\log\log n)) \ge 1 - \frac{1}{2\log n}$, it holds that
$R = O(\ell)$, i.e., the load vector is in $\mathcal{E}_{\ell}$ after at most $O(\ell)$ repetitions of $\tau$ many rounds. If the
load vector is in $\mathcal{E}_{\ell}$, then



max { x i R ' T) ~ 8i ■ K lQ g n ) e l - ^ °} < 4n 



e * 



i-(logn)^ 



Plugging in the values of e and £ yields 

y] max jx^' 1 ") - 8 [log log 



n 



16 

(log n) log log ' 



[log log n],o| 



^ 4n • exp ( — — • (log n 



[log log n] 



log log n I <^ \ 



where the last inequality holds for sufficiently large n. But from the last inequality it follows 
directly that for all nodes u £ V, 



,(«-t) 



.)•;," ' < 8 [log log n] ■ [e b ] + [log log n]. 



To get the corresponding lower bound for the minimum load, we use again Lemma 12. 71 Thus 
with probability at least 1— j^;, the discrepancy in round R-t is upper bounded by 0(log log n). 
This finishes the proof of the second statement and the proof of the theorem. ■ 
4 Proof of the Main Theorem (Theorem 1.1)

In the remainder of this paper we sketch the proof of Theorem 1.1. For the ease of the analysis
we "subtract" the same number of tokens from every node such that the resulting load vector $x$
satisfies $\bar{x} \in [0, 1)$ (cmp. discussion after eq. (2.1)). As illustrated in Figure 1, our proof consists
of the following three main steps.

1. Reducing the Discrepancy to (logn) £d . We first use Theorem 13.71 from Section [3] 
to show that in round ti := O (T con t (K,n~ 2 )) = O(^f^) the discrepancy is at most 
(logn) £d , where > is an arbitrarily small constant. 

2. Sparsification of the Load Vector. Since our goal is to achieve a constant discrepancy, 
we fix a constant C > and only consider nodes with more than C tokens. We prove in 
Theorem 14.21 that the number of tokens above the threshold C on these nodes is at most 
n • e - ( logn ) e in round £2 := h + ^(ttt) • The P ro °f °f this step is based on a polynomial 
potential function and exploits that the load vector in round t\ has small discrepancy. 

3. Reducing the Discrepancy to a Constant. Now we only need to analyze the n ■ 
e -(i°gn) £ tokens above the threshold C. This is equivalent to analyze a non-negative 
load vector with at most n-e -( - logn ) e tokens (Ob ser vat ion 1 2 . 6]) . We prove in Theorem 14. 31 
that in round £3 := £2 + OOrzir), there is no token above the threshold C + 1, using the 
token-based analysis via random walks (Section [3|). This upper bounds the maximum 
load; the lower bound on the minimum load follows by symmetry (Lemma I2.T[) . These 
two bounds together imply that the discrepancy in round t% is at most 2C + 2. 




Figure 1: The above diagram illustrates how Theorem 3.7, Theorem 4.2 and Theorem 4.3 are combined to prove Theorem 1.1.
We assume w.l.o.g. that $\bar{x} \in [0, 1)$ and consider only the drop of the maximum load (cf. Lemma 2.7 for the relation
between upper bounding the maximum load and lower bounding the minimum load).



Remark 4.1. All results and arguments in this section will hold for the balancing circuit model 
(with a constant d) and the random matching model as described in Section^ unless mentioned 
otherwise. In the analysis, one round in the random matching model corresponds to d consecutive 
rounds in the balancing circuit model, which ensures smooth convergence as we periodically apply 
the same sequence of d matchings. In fact, many of the complications in the proof come from 
the random matching model, as some nodes may not be part of the randomly chosen matchings
for a long period of rounds. 
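
To make the round structure of the matching model concrete, the following is a minimal simulation sketch. It assumes the usual discrete balancing rule (a matched pair splits its tokens as evenly as possible and a single excess token goes to either endpoint with probability 1/2). The helper `random_cycle_matching` is a hypothetical matching generator on an even cycle chosen only for illustration; it is not the matching distribution analyzed in this paper.

```python
import random

def balance_matching(load, matching, rng):
    """Apply one discrete balancing round along the edges of a matching.

    Each matched pair {u, v} splits its tokens as evenly as possible; if the
    pair holds an odd number of tokens, the excess token is assigned to u or
    v with probability 1/2 each (the random orientation of the edge).
    """
    new_load = list(load)
    for u, v in matching:
        low, excess = divmod(load[u] + load[v], 2)
        if rng.random() < 0.5:
            new_load[u], new_load[v] = low + excess, low
        else:
            new_load[u], new_load[v] = low, low + excess
    return new_load

def random_cycle_matching(n, rng):
    """Hypothetical generator (assumption): on a cycle with n even, take the
    even or the odd edges and keep each of them independently w.p. 1/2."""
    offset = rng.randrange(2)
    edges = [(i, (i + 1) % n) for i in range(offset, n, 2)]
    return [e for e in edges if rng.random() < 0.5]

def discrepancy(load):
    return max(load) - min(load)

rng = random.Random(0)
n = 16
load = [0] * n
load[0] = 160                      # all tokens start on one node
history = [discrepancy(load)]
for _ in range(400):
    load = balance_matching(load, random_cycle_matching(n, rng), rng)
    history.append(discrepancy(load))
```

In this sketch the total number of tokens is preserved and the discrepancy recorded in `history` never increases, since every matched node ends up between the two loads of its pair.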



4.1 Proof of Theorem 1.1

In this subsection we state Theorem 4.2 and Theorem 4.3 with their proofs deferred to Section 4.2
and Section 4.3, respectively. By assuming the correctness of these two theorems, we prove our
main result, Theorem 1.1, at the end of this subsection.



Theorem 4.2. Let G be any regular graph and e > be any constant. Then there are constants 
£d = £d( £ ) > and C = C(e) > such that the following holds. Any load vector x^ with 
discrepancy at most (logn) £d andx € [0, 1), satisfies with probability 1 — e~( logn ) n<1) the following 
inequality in round t := O ( 



^max{a4 r) - C, o} 



< n . e -(logn) 1 \ 



The basic idea to prove Theorem 4.2 is to consider the potential function
$\Phi^{(t)} = \sum_{u\in V\colon x_u^{(t)} \ge 11} \left(x_u^{(t)}\right)^8$ and show that it drops significantly. The complete proof of Theorem 4.2
is given in Section 4.2.



Theorem 4.3. Let G be any d-regular graph. Let e > be a sufficiently small constant and 
assume that x^ is a non-negative load vector with \\x^ ||i ^ n-e~( logn ) £ . Then with probability 
at least 1 — hn" 1 , it holds after k := O ( j^j) rounds that \\x^\\oo ^ 1. 



To show Theorem 14. 3\ we proceed similarly as in the proof of Theorem 14.21 However, here 
we employ an exponential potential function that runs over all nodes with load larger than 1. 
Exploiting the sparseness of x(°\ we show that after O {^^^ rounds, the value of the potential 

is at most n 2 . Then we exploit the sparseness again to derive an upper bound on the collision 
probability of two tokens. From this we conclude that after /3 rounds, the potential is reduced 
by a factor of meaning that on average the potential function drops exponentially 

every O (t^\) rounds. 

We defer the proof of Theorem 4.3 to Section 4.3 and first prove Theorem 1.1 assuming the
correctness of Theorem 4.2 and Theorem 4.3.



Proof of Theorem \l.l\ Let e > be the small constant required for Theorem l4.31 which in turns 
gives us a constant = Ed(s) > required for Theorem 14.21 By Theorem 13. 7\ the discrepancy 
is at most (logn) £d with probability at least 1 — e ~( logn ) n<1) in round t% := 0(— f^r^)- Next we 
apply Theorem 14.21 to prove that with probability at least 1 — e~( logn ) n<1) , the load vector x^ 
in round £2 '■= h + ^(tzt) satisfies 



™ {xl 2) - C, 0} sC n ■ e - (logn)1 ~ 



Next define for any round s ^ £2 a new load vector x^ by x\f '■= maxjxl — C, 0} for any u 6 V . 

Since by Observation 12.61 x^ s$ xi s) + C for every s ^ t2, it suffices to bound the maximum 
load of the non-negative load vector x^ for an upper bound on the maximum load of x^ . 
Since ^ n ■ e _ ( logri ) £ , we apply Theorem 14.31 to conclude in round £3 := £2 + C(x5x)» 

||a^ <3 ^||oo ^ 1 holds with probability at least 1 — 5n _1 . Hence by the union bound and the 
relation between x^ and x^\ the maximum load of x^ is at most C + 1 with probability 
at least 1 — e -( lo g n ) n(1) . The corresponding lower bound on the minimum load is derived by 
symmetry (see Lemma l2.7p . ■ 
4.2 Proof of Theorem 4.2

Throughout the proof of Theorem 4.2, we use the following potential function:

$$\Phi^{(t)} := \sum_{u\in V\colon x_u^{(t)} \ge 11} \left(x_u^{(t)}\right)^8. \qquad (4.1)$$

Occasionally, we will also apply this potential function to a different sequence of load vectors
$\tilde{x}^{(t)}, t \ge 0$, and denote this by $\Phi^{(t)}(\tilde{x})$. Our next observation is that $\Phi^{(t)}$ is non-increasing in
$t$. Since our protocol only transfers tokens from nodes with larger load to other nodes with
smaller load, it suffices to show that $x \mapsto x^8 \cdot \mathbf{1}_{x \ge 11}$ is convex, which follows from the convexity
of $x \mapsto x^8$ and $11^8 - 0 \le 12^8 - 11^8$.
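
The convexity argument can be checked numerically. The sketch below assumes the potential sums the eighth power of the load over all nodes with load at least 11, as described above; the function names and the sample load vectors are illustrative only.

```python
def node_term(x, threshold=11, power=8):
    """Contribution of one node: x**power if x >= threshold, else 0."""
    return x ** power if x >= threshold else 0

def potential(load, threshold=11, power=8):
    """Potential of a load vector: sum of the per-node contributions."""
    return sum(node_term(x, threshold, power) for x in load)

# Discrete convexity on the integers: the increments f(x+1) - f(x) are
# non-decreasing. The only non-trivial place is the jump at the threshold,
# where the check reduces to 11**8 - 0 <= 12**8 - 11**8.
increments = [node_term(x + 1) - node_term(x) for x in range(0, 40)]
is_convex = all(a <= b for a, b in zip(increments, increments[1:]))

# Moving one token from a heavier node to a strictly lighter node
# (load difference at least 2) does not increase the potential.
before = [15, 9, 12, 3]
after = [14, 10, 12, 3]    # one token moved from node 0 to node 1
drop = potential(before) - potential(after)
```

Here `drop` equals $15^8 - 14^8$, because the receiving node stays below the threshold 11 and contributes nothing before or after the transfer.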

The key step in proving Theorem 14.21 is to analyze the drop of the potential function <£, 
which is formalized in the following lemma. 

Lemma 4.4. Fix a constant a £ (0, 1) and let r := O ^^j^ • Then for sufficiently small 
constant Ed £ (0, 1) (independent of a) the following two statements hold. 

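• For any load vector $x^{(t)}$ at the end of round $t$ with discrepancy at most $(\log n)^{\varepsilon_d}$, it holds
with probability $1 - e^{-(\log n)^{\Omega(1)}}$ that

$$\Phi^{(t+\tau)} \le n\cdot e^{-(\log n)^{1/24}}.$$

• If the load vector $x^{(t)}$ is non-negative, has discrepancy at most $(\log n)^{\varepsilon_d}$ and satisfies
$\|x^{(t)}\|_1 \le n\cdot e^{-(\log n)^{\sigma}}$, then it holds with probability $1 - e^{-(\log n)^{\Omega(1)}}$ that

$$\Phi^{(t+\tau)} \le n\cdot\exp\left(-(\log n)^{(1-11\varepsilon_d) - \frac{38}{39}\left((1-11\varepsilon_d) - \sigma\right)}\right).$$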

Since $\sum_{u\in V} \max\{x_u^{(t+\tau)} - 10, 0\} \le \Phi^{(t+\tau)}$, the second statement of Lemma 4.4 states that
the number of tokens above the threshold 10 at the end of round $t+\tau$ is much smaller than the
total number of tokens in round $t$ (assuming the total number of tokens is less than $n\cdot e^{-(\log n)^{\sigma}}$,
which can be achieved by a single application of the first statement of Lemma 4.4). This
argument can be iterated a constant number of times to obtain that the number of tokens
above the threshold $10\cdot k$ at the end of round $t + k\cdot\tau$ is at most $n\cdot e^{-(\log n)^{1-\varepsilon}}$ for a sufficiently
large constant $k$, yielding Theorem 4.2.

4.2.1 Proof of Theorem 4.2 using Lemma 4.4

We now proceed with the formal proof of Theorem 4.2 assuming the correctness of Lemma 4.4,
whose proof is given in Section 4.2.2.

Proof of Theorem 4.2. By assumption, the discrepancy of the load vector $x^{(t)}$ is at most $(\log n)^{\varepsilon_d}$,
where $t := O(T_{\mathrm{cont}}(K, n^{-3}))$ and $\varepsilon_d \le \varepsilon/22$ is small enough (depending on the requirement of
Lemma 4.4). Then the first statement of Lemma 4.4 implies that with probability $1 - e^{-(\log n)^{\Omega(1)}}$,

$$\sum_{u\in V} \max\left\{x_u^{(t+\tau)} - 10,\ 0\right\} \le \Phi^{(t+\tau)} \le n\cdot e^{-(\log n)^{1/24}}.$$

Let us now define the load vector $\tilde{x}_u^{(s)} := \max\{x_u^{(s)} - 10, 0\}$ for any $s \ge t+\tau$. By Observation 2.6
it holds for any $s \ge t+\tau$ and node $u \in V$ that

$$x_u^{(s)} \le \tilde{x}_u^{(s)} + 10,$$

which allows us to work with the non-negative load vector $\tilde{x}$ in the following. By definition of
$\tilde{x}$ and the condition on $x^{(t+\tau)}$, $\|\tilde{x}^{(t+\tau)}\|_1 \le n\cdot e^{-(\log n)^{1/24}}$. Applying the second statement of
Lemma 4.4, it follows with probability $1 - e^{-(\log n)^{\Omega(1)}}$ that

$$\sum_{u\in V} \max\left\{\tilde{x}_u^{(t+2\tau)} - 10,\ 0\right\} \le \Phi^{(t+2\tau)}(\tilde{x}) \le n\cdot\exp\left(-(\log n)^{(1-11\varepsilon_d) - \frac{38}{39}\left((1-11\varepsilon_d) - a(1)\right)}\right),$$

where $a(1) := \frac{1}{24}$. Consequently,

$$\sum_{u\in V} \max\left\{x_u^{(t+2\tau)} - 2\cdot 10,\ 0\right\} \le \sum_{u\in V} \max\left\{\tilde{x}_u^{(t+2\tau)} - 10,\ 0\right\} \le n\cdot\exp\left(-(\log n)^{(1-11\varepsilon_d) - \frac{38}{39}\left((1-11\varepsilon_d) - a(1)\right)}\right).$$

Since the sequence $a(i), i \in \mathbb{N}$, defined by the recursion

$$a(i) := (1 - 11\varepsilon_d) - \frac{38}{39}\left((1 - 11\varepsilon_d) - a(i-1)\right),$$

$a(1) = \frac{1}{24}$, is non-decreasing and converges to $1 - 11\varepsilon_d$, it follows by the union bound that with
probability at least $1 - k\cdot e^{-(\log n)^{\Omega(1)}}$, it holds for any integer $k \in \mathbb{N}$ that

$$\sum_{u\in V} \max\left\{x_u^{(t+k\tau)} - k\cdot 10,\ 0\right\} \le n\cdot e^{-(\log n)^{a(k)}}.$$

Further, for any $\varepsilon > 0$, there exists a (large) constant $C = C(\varepsilon) > 0$ so that for any $k \ge C$,
$a(k) \ge 1 - 11\varepsilon_d - \varepsilon/2$. Since $\varepsilon_d \le \varepsilon/22$, Theorem 4.2 follows. ■
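
The behaviour of the exponent sequence can be illustrated with exact rational arithmetic. The sketch below assumes the recursion has the contraction factor 38/39 and the start value $a(1) = 1/24$ as used in the proof above; the concrete values of `eps_d` and `eps` are illustrative choices satisfying $\varepsilon_d \le \varepsilon/22$, not values taken from the paper.

```python
from fractions import Fraction

def exponent_sequence(eps_d, steps):
    """Return [a(1), ..., a(steps)] for the recursion
    a(i) = (1 - 11*eps_d) - 38/39 * ((1 - 11*eps_d) - a(i-1)), a(1) = 1/24."""
    limit = 1 - 11 * eps_d
    seq = [Fraction(1, 24)]
    for _ in range(steps - 1):
        seq.append(limit - Fraction(38, 39) * (limit - seq[-1]))
    return seq

eps = Fraction(22, 1000)       # illustrative target constant
eps_d = eps / 22               # largest value allowed by eps_d <= eps/22
limit = 1 - 11 * eps_d

seq = exponent_sequence(eps_d, 400)

# Smallest index k with a(k) >= 1 - 11*eps_d - eps/2; this plays the role
# of the constant C = C(eps) in the proof.
first_k = next(k for k, a_k in enumerate(seq, start=1)
               if a_k >= limit - eps / 2)
```

The gap to the limit shrinks by the factor 38/39 in every step, so the number of iterations needed depends only on $\varepsilon$ and $\varepsilon_d$ and not on $n$.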



4.2.2 Proof of Lemma 4.4

This part is devoted to the proof of Lemma 4.4. First, we define canonical paths.

Definition 4.5 ([17]). The sequence $\mathcal{P}_v = (\mathcal{P}_v^{(t_1)} = v, \mathcal{P}_v^{(t_1+1)}, \ldots)$ is called the canonical path
of $v$ from round $t_1$ if for all rounds $t$ with $t \ge t_1$ the following holds. If $v_t := \mathcal{P}_v^{(t)}$ is unmatched
in $M^{(t+1)}$, then $v_{t+1} = v_t$ and $\mathcal{P}_v^{(t+1)} := v_{t+1}$. Otherwise, let $u \in V$ be the node such that
$\{v_t, u\} \in M^{(t+1)}$. Then,

• if $x_{v_t}^{(t)} \ge x_u^{(t)}$ and $\Phi_{v_t,u}^{(t+1)} = 1$ then $v_{t+1} = v_t$,

• if $x_{v_t}^{(t)} \ge x_u^{(t)}$ and $\Phi_{v_t,u}^{(t+1)} = -1$ then $v_{t+1} = u$,

• if $x_{v_t}^{(t)} < x_u^{(t)}$ and $\Phi_{v_t,u}^{(t+1)} = 1$ then $v_{t+1} = u$,

• if $x_{v_t}^{(t)} < x_u^{(t)}$ and $\Phi_{v_t,u}^{(t+1)} = -1$ then $v_{t+1} = v_t$.

An illustration of this definition is given in Figure 2.

Note that there are always exactly n canonical paths, which are all vertex-disjoint. We 
define canonical paths so that if two of them are connected by a matching edge, then they 
continue in a way so that the changes of the load (in absolute values) on each of the two paths 
is minimized. Also note that every canonical path performs a random walk on G, i.e., at every 
incident matching edge it switches to the opposite node with probability 1/2, otherwise it stays 
at the current node. 
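
The four cases of Definition 4.5 translate directly into a one-step update rule. The following sketch is illustrative only: the dictionary-based encoding of the matching and of the edge orientations, and the convention that the two orientation values of an edge are negatives of each other, are assumptions made for this example.

```python
def next_on_canonical_path(v, load, partner, orientation):
    """Return the next node of the canonical path currently located at v.

    load        -- dict: node -> load at the end of round t
    partner     -- dict: node -> matched node in round t+1 (unmatched
                   nodes are absent)
    orientation -- dict: (a, b) -> +1 or -1 for every matched ordered pair
                   (assumed: orientation[(a, b)] == -orientation[(b, a)])
    """
    if v not in partner:
        return v                       # unmatched: the path stays put
    u = partner[v]
    phi = orientation[(v, u)]
    if load[v] >= load[u]:
        return v if phi == 1 else u    # cases 1 and 2 of the definition
    return u if phi == 1 else v        # cases 3 and 4 of the definition

# A single matched edge {0, 1} with distinct loads, checked for both
# orientations.
load = {0: 14, 1: 6}
partner = {0: 1, 1: 0}
stay = {(0, 1): 1, (1, 0): -1}
swap = {(0, 1): -1, (1, 0): 1}

stay_result = (next_on_canonical_path(0, load, partner, stay),
               next_on_canonical_path(1, load, partner, stay))
swap_result = (next_on_canonical_path(0, load, partner, swap),
               next_on_canonical_path(1, load, partner, swap))
```

For distinct loads the two paths on a matched edge either both stay or swap endpoints, so they remain vertex-disjoint, and each of the two outcomes corresponds to one value of the orientation.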



Observation 4.6. Fix the load vector $x^{(t_1)}$ at the end of round $t_1$ and consider the canonical path
$\mathcal{P}_v = (\mathcal{P}_v^{(t_1)} = v, \mathcal{P}_v^{(t_1+1)}, \ldots)$ of $v$ from round $t_1$. Then for any round $t \ge t_1$ and any node
$w \in V$ it holds that

$$\Pr\left[\mathcal{P}_v^{(t)} = w\right] = M_{v,w}^{[t_1+1,t]}.$$

Figure 2: Illustration of the four cases of Definition 4.5 with $\{v_t, u\} \in M^{(t+1)}$. The black arrows indicate the value of
$\Phi_{v_t,u}^{(t+1)}$ (i.e., the direction of the excess token (if any)) and the grey directed edge is part of the canonical path.



Now we sketch the key ideas of the proof of Lemma 4.4. We use the polynomial potential
function $\Phi$ that only considers nodes with load at least 11 (see (4.1)). Using the condition
that the discrepancy of the load vector is at most $(\log n)^{\varepsilon_d}$, it follows directly that the initial
value of the potential $\Phi$ is upper bounded by an almost linear function in $n$ (Observation 4.7).
To prove that $\Phi$ decreases, we consider phases of length $\beta := (\log n)^{\varepsilon_t}$, where $\varepsilon_t \in (0, 1)$ is some
small constant. In each such phase, we consider the canonical paths starting from nodes with
load at least 11 together with the canonical paths starting from nodes with load at most 9.

To lower bound the probability that two canonical paths collide, we use the relation between
canonical paths and random walks, i.e., as long as both canonical paths have not been connected
by a matching edge, they evolve like independent random walks. Then, we sieve out those nodes
with load at least 11 from which a canonical path has only a small probability to collide with
a canonical path starting from a node with load at most 9 (Definition 4.9 and Lemma 4.10).
Using this, we establish in Lemma 4.11 that there are indeed sufficiently many collisions, i.e.,
sufficiently many canonical paths with load at least 11 that collide with a canonical path with
load at most 9 within the phase of length $\beta$. Clearly, not every collision between two canonical
paths reduces the potential (since the load of one of the two canonical paths can change before),
but as we argue in Lemma 4.11, the decrease of the potential is essentially the number of
collisions divided by the length of the phase $\beta$.

The next observation provides an easy bound on the initial value of the potential function
defined by (4.1), exploiting the small discrepancy of the load vector.

Observation 4.7. Consider any load vector $x^{(t)}$ with $\bar{x} \in [0, 1)$ and discrepancy at most
$(\log n)^{\varepsilon_d}$. Then $\Phi^{(t)} \le n\cdot\left((\log n)^{\varepsilon_d} + 1\right)^8$.
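
The reasoning behind this bound is short: if the average load lies in $[0,1)$, the minimum load is below 1, so every load is smaller than the discrepancy plus one. The sketch below checks this on a small hand-picked integer load vector, with a generic discrepancy bound `D` standing in for $(\log n)^{\varepsilon_d}$; the vector and the helper names are illustrative assumptions.

```python
def potential(load, threshold=11, power=8):
    """Sum of x**power over all nodes with load x >= threshold."""
    return sum(x ** power for x in load if x >= threshold)

def potential_bound(n, D, power=8):
    """Upper bound n * (D + 1)**power for a load vector on n nodes with
    discrepancy at most D and average load in [0, 1)."""
    return n * (D + 1) ** power

# Loads after "subtracting" the same number of tokens from every node,
# so individual entries may be negative.
load = [12, -2, 0, -3, -1, 0, -2, 1]
n = len(load)
average = sum(load) / n
D = max(load) - min(load)

phi = potential(load)
bound = potential_bound(n, D)
```

In this instance only the node with load 12 contributes to the potential, which is far below the bound $n\cdot(D+1)^8$.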

Our goal is to consider phases of length $\beta$ and prove that after each such phase, the expected
value of the potential function drops. First, we define different conditions for a pair of node $u$
and round $t$, which are necessary to cope with the inherent randomness of the balancing circuit
and especially the random matching model.

Definition 4.8. Let $\varepsilon_t > 0$ be any constant. For any node $u \in V$ and round $t \in \mathbb{N}$, define the
following three conditions:

• COND1$(u,t)$: $x_u^{(t)} \ge 11$.

• COND2$(u,t)$: $\left\| M_{u,\cdot}^{[t+1,t+\beta]} \right\|_2^2 \le (\log n)^{-\varepsilon_t/7}$.

• COND3$(u,t)$: $\sum_{w\in V} \left( M_{u,w}^{[t+1,t+\beta]} \cdot \sum_{v\in V} x_v^{(t)} M_{v,w}^{[t+1,t+\beta]} \right) \ge 4$.

Moreover let $S_1^{(t)}$ be the set of nodes $u$ satisfying COND1$(u,t)$. The sets $S_2^{(t)}$ and $S_3^{(t)}$ are defined
in the same way.

Note that COND2$(u,t)$ ensures that the local neighborhood around the node $u$ with respect
to the (directed) graph induced by the matchings within the time-interval $[t+1, t+\beta]$ expands
sufficiently. With regards to COND3$(u,t)$, recall that the probability distribution of the location
of the canonical path of $u$ in round $t+\beta$ is $M_{u,\cdot}^{[t+1,t+\beta]}$. Moreover, for any $w \in V$ and fixed
load vector $x^{(t)}$, $\sum_{v\in V} x_v^{(t)} M_{v,w}^{[t+1,t+\beta]}$ is the expected load on node $w$ in round $t+\beta$. Hence,
if COND3$(u,t)$ holds, then at the (random) location of the canonical path in round $t+\beta$, the
expected load is large, i.e., at least 4. We point out that although this argument gives the right
intuition, it is not precisely true, since conditioning on the location of a canonical path affects
the random orientations within the interval $[t+1, t+\beta]$, which in turn affects the load vector
at the end of round $t+\beta$.

Definition 4.9. A node $u \in V$ is called bad in round $t$ if node $u$ satisfies COND2$(u,t)$ and
COND3$(u,t)$ simultaneously. Let $\mathcal{B}^{(t)} \subseteq V$ be the set of all bad nodes in round $t$.

In other words, a node $u$ is bad if despite enjoying good expansion in the graph induced
by the matchings within the time-interval $[t+1, t+\beta]$, the expected number of tokens at the
endpoint of the canonical path is at least 4.

The next lemma shows that the number of bad nodes decreases exponentially in the phase
length $\beta = (\log n)^{\varepsilon_t}$. Moreover the second statement provides a tail bound which is exponentially
small in the "sparseness" $\sigma$.

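Lemma 4.10. Let $\varepsilon_t \in (0,1)$, $\varepsilon_d \in (0,1)$ be two arbitrarily chosen constants. Fix an arbitrary
load vector $x^{(0)}$ with discrepancy at most $(\log n)^{\varepsilon_d}$ and $\bar{x} \in [0, 1)$. Then the following statements
hold:

• For any round $t \ge T_{\mathrm{cont}}(n, n^{-3})$,

$$\Pr\left[ \left|\mathcal{B}^{(t)}\right| \le n\cdot e^{-(\log n)^{\varepsilon_t/9}} \right] \ge 1 - e^{-(\log n)^{\varepsilon_t/9}} - \frac{1}{n}.$$

• If $x^{(0)}$ is non-negative and satisfies $\|x^{(0)}\|_1 \le n\cdot e^{-(\log n)^{\sigma}}$ for some constant $\sigma \in (0,1)$,
then for any round $t \ge T_{\mathrm{cont}}(n, n^{-3})$,

$$\Pr\left[ \left|\mathcal{B}^{(t)}\right| \le n\cdot e^{-(\log n)^{\varepsilon_t/17+\sigma}} \right] \ge 1 - e^{-(\log n)^{\varepsilon_t/17+\sigma}} - \frac{1}{n}.$$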



Proof. We begin by proving the first statement. Fix any node u G V in round t and consider 
M [t+i,t+P] 2 
2 



variable in t 



. While this is a deterministic value for the balancing circuit model, it is a random 



re random matching model. By Lemma IB.41 



Pr 



M 



t+i,t+/3] 



^ (logn) 



-et/6 



< e -(logn)^/ 2 _ 



For the balancing circuit model, we replace the f3 rounds by d ■ f3 rounds which corresponds to 
the /3-times iteration of the round matrix M = Y\i=i . Hence, by Corollary IB.2I we have 



M. 



M 



1 ^ I 1 

< - + °[ n W9 

00 n \(logn) £t ' 2 



^ (logn) 



-et/6 



Let us now return to the random matching model. By Markov's inequality we get 



Pr 



I {-it eV:u does not satisfy Cond2 (it, £)} | ^ n ■ e" 



-1 (log n) £ t/ 2 



^ e 2 



i-(logn) £ t/ 2 
that is,



Pr 



V\S. 



it) 



^ n ■ e 2 



i-(logn) £ t/ 2 ' 



> 1 -e~a 



i(logn) £ t/ 2 



(4.2) 



Note that $S_2^{(t)}$ depends only on the random choices for the matchings within the time-interval
$[t+1, t+\beta]$, which is independent of the load vector $x^{(t)}$ and the matchings in the time-interval
$[0, t]$. Since $t \ge T_{\mathrm{cont}}(n, n^{-3})$ it follows that

$$\Pr\left[\, [0,t] \text{ is } (n, n^{-3})\text{-smoothing} \,\right] \ge 1 - n^{-1}. \qquad (4.3)$$

For the remainder of the proof, we tacitly assume that $[0, t]$ is $(n, n^{-3})$-smoothing. Hence by
Lemma [2.13\ it holds for any node u 6 and 8 > 1/n that 



(5 - l/{2n)f 













/ 


Pr 






^5 


u e 


^ 2 exp 

V 



M 



[*+l,t+/8] 



Choosing 5 = 1 and recalling x ^ 1 yields 



Pr 



^4 t] MK w ^2 



^ 2 exp 



(1 - l/(2n)) 2 ^ ^ e _ (logn)et / 
4 • (logn)" £ */ 7 



Hence by Markov's inequality, 



Pr 



s. 



(t) 



e 2 



\ -(log n) £ i/ 8 



< e 2 



i-(logn) £ t/ 8 



Let 72.^ := j-u G V: X^ev ^^J 1 '^^ ) 2|. Then we can rewrite the inequality above as 



Pr 



;(t) 



e 2 



i- (log n) e «/ 8 



) 1 - e"2 



§ (log n) e «/ 8 



(4.4) 



Combining (jMD, (JO} and ftW C (ftW n S^) U (V \ s!£A gives 



Pr 



(4.5) 



Our next goal is to upper bound the size of 5g in terms of |7£W|. By definition, we have 



E Ef M ^ wl E4 t) M^wA)4. 



,(t) 



(4.6) 



On the other hand, since Xmax ^ iSL ^ (logn) £d + 1 ^ 2(logn) £d , we have 

E E U^ t+m E^ )M[ ^ t+m ) 



^E E (mE^E^mK^I+E E (Mfc^E^Mfc 1 ^) 



ueS « wev\n<t 



< 2(log n)«- • E E Mfc 1 '^ 1 E M ^ M+/?] + 2 |^ } | 



<2(logn) 6 ««- E E M ^ M+/3]+2 | 5 : 
< 2(logn) £d • IftWl +2|«S^ ) . 



(*)| 



(4.7) 
Combining (4.6) and (4.7) we have



,(t) 



< (logn) £d 



7j(t) 



Using this and 0^ C 5g , the first statement follows from (|4.5p . 

Let us now turn to the second statement of the lemma. As before, we tacitly assume that 
the time-interval [0, t] is (n, n _3 )-smoothing, which holds with probability at least 1 — n _1 . Fix 
any node u G 5o^. Note that 



t+/3] 



V V m[' +1 -* +/3 i • m[* +1 '* 



Now we apply Lemma 1331 with y„ = J2 weV MuJ, 1 '* • M^ 1 '* . Since it G 5^, the vector y 



satisfies 



|2/||oo < 



M 



[t+i,t+p] 



M J*+W] 



^ (logn) 



-et/14 



and choosing 5 = £t/16 in Lemma 13.51 gives 
PrLe #W 

= Pr 



u £ S, 



u G 5. 



< Pr 



E ( E M i*t 1 '* +/3lM K 1 ' t+/3] ) ' x v ] > e-^ (logn)CT +8(logn)- £ '/ 14 • (logn) £ '/ 



U £ S, 



(t) 



^ exp (-(logn) e '/ 16+a /6) < exp (-2 • (log n) £ '/ 17+(J ) . 
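Therefore, by Markov's inequality,

$$\Pr\left[ \left|\mathcal{B}^{(t)}\right| \ge n\cdot e^{-(\log n)^{\varepsilon_t/17+\sigma}} \right] \le e^{-(\log n)^{\varepsilon_t/17+\sigma}}.$$

For any round $t$, define $S_{12}^{(t)}$ to be the set of nodes $u$ satisfying COND1$(u,t)$ and COND2$(u,t)$
and let

$$V_1^{(t)} := S_{12}^{(t)} \setminus \mathcal{B}^{(t)}$$

be the set of nodes $u$ satisfying COND1$(u,t)$ and COND2$(u,t)$, but not COND3$(u,t)$. Intuitively,
every node in $V_1^{(t)}$ has (i) large load, (ii) expanding neighborhood and (iii) small expected load
at the location of the canonical path of $u$ in round $t+\beta$. Hence if there are many nodes in
$V_1^{(t)}$, then we would hope for a decrease in the potential $\Phi$, which is formalized in the following
lemma.

Lemma 4.11. Let $\varepsilon_b$, $\varepsilon_d$ and $\varepsilon_t$ be three arbitrary constants in the interval $(0,1)$. Fix a load
vector $x^{(t)}$ with $\bar{x} \in [0, 1)$ and discrepancy at most $(\log n)^{\varepsilon_d}$. Assume that

$$\left|V_1^{(t)}\right| \ge \frac{1}{2}\cdot\left|S_1^{(t)}\right| - n\cdot e^{-(\log n)^{\varepsilon_b}}.$$

Then it holds in round $t+\beta$, $\beta := (\log n)^{\varepsilon_t}$, that

$$\mathbf{E}\left[\Phi^{(t+\beta)}\right] \le \max\left\{ \left(1 - \frac{1}{18(\log n)^{\varepsilon_t+9\varepsilon_d}}\right)\cdot\Phi^{(t)},\ \ 4n\cdot e^{-(\log n)^{\varepsilon_b}}\cdot\left((\log n)^{\varepsilon_d} + 1\right)^8 \right\}.$$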
Proof. We remark that the precondition on $V_1^{(t)}$ depends only on the random choices for the
matchings in the interval $[t+1, t+\beta]$, but not on the orientation of these edges. Therefore, the
orientations of the matchings in the time-interval $[t+1, t+\beta]$ are still chosen independently
and uniformly at random.

By definition, the node u G satisfies 



Let 



w 



vdV J 



Since the probability distribution of Vu + ^ , the location of the canonical path of u from round 
t in round t + /3, is given by M^ 1,i+ ^, it follows by Markov's inequality that 



Pr 



P (t+0) e g(t)j > 1. (48) 



Fix now a node ui G £7^. Our next aim is to derive a lower bound on the probability that there 
is a canonical path starting from a node k with sSl ^ 9 that reaches w in round t + /3. To this 
end, let us define := {v G V : x® ^ 9) and a w := £ „ (t) M^ 1 '*^ 1 = £ cV W m[^ M+/3] 
for node Then, 



8 £ e ^ • M S 1? 



*+/*] 



uGV 2 (t) i>GV\V 2 (t) 

> -(logn) £d -a w + W- (l-a w ), 
and rearranging yields for any w G 

2 1 

° w (logn)^ + 10 > (lognK { } 

Let us now lower bound the probability for the event that the canonical path starting from 
u G collides with a canonical path starting from a node in within the time-interval 
[t + 1, t + P\. By ()4.8p . the canonical path of it reaches a node if G £7^ with probability at least 
1/2. Moreover, a canonical path starting from a fixed node v G performs an independent 
random walk based on the random orientations as long as it does not reach a node g in round 
s G [t + 1, t + P] so that node g is matched with another node g' in round s and the canonical 
path of u is located at node g' in round s — 1. Hence, 



Pr 



3s G [t + l,t + 13]: [v^\V^] G I = > M[*+ 1 >*+^ 



i.e., the probability that the canonical path starting from a fixed node v G collides with 
the canonical path starting from u in round t and reaches node w in round t + /3, is at least 
MuJ 1 '* . See Figure [3] for an illustration. 

We now define a bipartite graph $H = (V_1^{(t)} \cup V_2^{(t)}, E)$. We place an edge $\{u, v\} \in E$ if the
canonical path of $u \in V_1^{(t)}$ collides with the canonical path of $v \in V_2^{(t)}$ within the time-interval
$[t+1, t+\beta]$. We first lower bound the expected number of edges in $H$:



E[\E(H)\] 

= E VT[3s€[t + l,t + 0\: {7>W,pW}eMW 



ueV W «;GV 2 (t) 



E E E p 4^ 



10 



Pr 



3s € [t + 1, i + /3] : , } G M< a > | p(' 



t+/3) 



> E E E rr[v^= W ].M^^ 

ueV W ueV W weOW 



E E Pr N 



(*+/?) 



+0 



(t) 



By gj]), we know that ^ gy(t) M^ 1 ' 4 ^ 1 ^ Hence 



E[|£(tf)|] > 



(log n] 



E E Pr K +/3) 



(log n) 



V, 



(t) 



1 

2 ' 



where the second inequality follows from (|4.8p . By the precondition of this lemma, 



i 



n • e 



— (logn) e s> 



If 



^ 4n • e ( lo s n ) £il 5 then by the above inequality we have 



(t) 



sf^ . Otherwise, 



$(*) = < UP •((logn^ + lfs^n-e-^™^ • ((log n) £d + 1) 



which finishes the proof of this lemma, as & t+ ^ ^ $w. Hence it remains to consider the case 



(0 



^ 4n • e ( 1 °g n ) £i ' (and consequently, 
E(H)\] as follows: 

V[\E{H)\]> 



(0 



51 



(0 



). This allows us to lower bound 



1 



• (log n) £d 
1 

8 • (log n) £ d 

1 

9 • (logn) 9£ d 



(t) 



$(0 



1 



((logn) £ d + l) f 



(4.10) 



for sufficiently large n. 

Consider now the auxiliary graph $H$ again. By definition, $H$ contains all those edges $\{u, v\}$
with the property that two canonical paths of $u \in V_1^{(t)}$ and $v \in V_2^{(t)}$ collide within the time-interval
$[t+1, t+\beta]$. Fix now an arbitrary edge $\{u, v\} \in E(H)$ and let $s \in [t+1, t+\beta]$ be the
round when the canonical paths of $u$ and $v$ meet for the first time. Then in round $s$ these two
canonical paths contribute a decrease of at least one to the potential function $\Phi$ if $x_u^{(s)} \ge 11$ and
$x_v^{(s)} \le 9$. If this does not happen, then we know that either $x_u^{(s)} < 11$ or $x_v^{(s)} > 9$, which means
that one of the two canonical paths has contributed a decrease of at least one before round $s$.
Put differently, every node in $V_1^{(t)}$ which is not isolated in $H$ contributes at least $1/2$ to the
reduction of the potential function. Since the maximum degree of graph $H$, denoted by $\Delta(H)$,
is at most $\beta = (\log n)^{\varepsilon_t}$, it follows that the number of nodes in $V_1^{(t)}$ which are not isolated is at
least $|E(H)|/(\log n)^{\varepsilon_t}$. Therefore,

$$\Phi^{(t)} - \Phi^{(t+\beta)} \ge \frac{1}{2}\cdot\frac{|E(H)|}{\Delta(H)} \ge \frac{|E(H)|}{2\cdot(\log n)^{\varepsilon_t}}.$$

Taking expectations on both sides and using our lower bound from (|4.10|) on E [ \ E{H) \ ] finally 
gives 



E 



$(*) _ 



> E[ 1^(^)1, 



1 



2 • (log n) e * 18 • (log n) et+9£ d 



.$(*) 



which completes the proof. 




Finally, we are now ready to prove Lemma 4.4 by combining Observation 4.7, Lemma 4.10
and Lemma 4.11.



Proof of Lemma \4-4\ We begin with the first statement. Let t\ := t + T cont (n, n 3 ) = t + 
O (j^x). By Observation O it holds that 



$(*i) ^ $09 <; n . ((l gn) £[i + l) 8 . 
For any round s 6 [£i,ii + logn], let £ s be the event defined by 



S, :-- 



(«) 



1 

> - 
2 



,(.) 



n • e 



-(logn) e 



where e b := e t /9. Further, let £ := /\ s 1 J t ° sn £ s - We are interested in a lower bound on Pr [ £ ] . 
First, by the first statement of Lemma 14.101 f° r anv round s ^ t\: 



Pr 



^ n • e" 



-(logn) 



> 1-e" 



-(logn) e */ 9 _ n -l_ 



Moreover, for any s ^t\, Lemma lB.41 gives for any u G S\ 



Pr 



M 



[s+l,s+(logn) £ t] 



^ (logn) 



-et/7 



< e -(logJi)' 



t/2 






Hence E S[ 2 > ^ S[ 
by Markov's inequality, 



1-e 



-(lognft/ 2 ^ 



and E 



S« \ Sg 



. e -(logn) £ t/ 2 _ Then 



Pr 



'12 



1 

<2' 



and consequently 



< Pr 



Pr 



$W \ 5« 



I . P (logn)^/ 2 . E 



5 W \ 



12 



< 2e 



5 



1 

>5' 



si 



> 1 - 2 ■ e" 



-(log n) £ i/ 2 



-(log n) £ t' /2 



(4.11) 



Since := <S^ \ £>^ s \ we conclude by the union bound that with probability at least 1 

e -(logn) e i/ fl _ n -l _ 2 . e -(logn) £ '/ 2 > 1 _ 2 • e -(logn) £ t/ 9 



'12 



1 

> - . 
2 



51 



n • e 



-(log n) e */ 9 



Hence by the union bound over logn rounds, 

Pr [£ } > 1 - logn • e-( lo ^' E!/9 > 1 - e -( lo ^) et/1 °. 
By Lemma SH] with e b = e t /9, if ^ ^ n ■ e -( lo g n ) £ ' /10 , then 

1 ~ l8.(logV + ^ ) " ^ > ^ " e ^ (l0S " )£6 " ((1 ° gn)£d + lf 



(4.12) 



and 



E 



$(*!+/?) I A 



1 



1 



(4.13) 



18 • (logn) et+9£ d 

Our claim is that a very similar inequality also holds for E | £ J , whenever ^ 

n . e -(logn)^/ 10 _ Note that 



E 



I A 



^ Pr [£ | £ 4l ] • E 
^ Pr[£ A£ 4l ] -E 



| £ A£ tl 



I £ 



= Pr[£] -E | £ 

^(l_ e -(i° g ^ /10 ). E ~ 
where the last inequality follows from ()4.12p . In combination with (|4.13j) . we have 

<; ^l_ e -(iog™) et/10N "' 

<J (l _ e -(logn) e * /10 
1 



E 



E 



1 



< 1 



36 • (logn) £ *+ 9£ d 



18 • (log n) e *+ 9£ d 



$(*i) 



(4.14) 



as long as > n • e -( lo g n ) £t/1 °. 

Because of ()4.12[) . we will work in the following on the probability space conditioned on 
£ = A^Jtj^ n &s ■ Our next aim is to iterate (|4.14|) . For simplicity, we shall iterate the following 
inequality: 



E 



I £ 



< 1 



36 • (logn) £ '+ 9£ d 






which holds for arbitrary $\Phi^{(t_1)}$. Applying the chain rule of expectations $(\gamma - 1)$ times gives

1 ^ 



E 



$(*l+7-/3) I 5 



< 1 



36- (logn) e '+ 9£ d 



+ 7 • re ■ e" 



- (log n) e */ 10 



Since the initial potential satisfies $\Phi^{(t_1)} \leq n \cdot ((\log n)^{\epsilon_d} + 1)^8$ (cf. Observation 4.7), choosing
$\gamma = (\log n)^{1-\epsilon_t}$ yields



E 



< 1 



36 • (logn) e *+ 9e d 



$(*i) 



+ 7 • re • e" 



-(logn) 



t/xo 



< exp [ --^ • (logn) 1_2et - 9£d> ) • n • ((logn) 6 " + l) 8 + i • n • e -( lo s n ) et/n 



1 



1 



• exp (-(logn) 1 - 224-10 ^) • n + ~ 



n • e 



-(logn) 



Solving $1 - 2\epsilon_t - 10\epsilon_d = \epsilon_t/11$ yields $\epsilon_t = \frac{11}{23} \cdot (1 - 10\epsilon_d) > 0$, for which the expected value of the potential
in round $t_1 + \gamma \cdot (\log n)^{\epsilon_t} = t_1 + \log n$ is at most

1-IOsj 



E 



$(*l+logn) I £ 



^ re • exp I — (log n 



Hence by choosing small enough and by Markov's inequality, 



Pr 



$(tl+logn) > n . e -(logn)2? 



< e -( logn ) 



n(i) 



(4.15) 



Since $t_1 + \log n = t + \tau_{\mathrm{cont}}(n, n^{-3}) + \log n = t + O\left(\frac{\log n}{1-\lambda}\right)$, taking the union bound over (4.12)
and (4.15) yields the first statement of Lemma 4.4.

Now we prove the second statement of Lemma 4.4. Since the load vector satisfies
$\|x^{(0)}\|_1 \leq n \cdot e^{-(\log n)^{\sigma}}$, by the second statement of Lemma 4.10 it holds for any round $s \geq t_1$
that



Pr 



B (s) 



^ re ■ e 



-(log™)' 



t /VT+a 



-(logn) £ i/ 17 +' T _ 1 

re 



(4.16) 



For any round $s \in [t_1, t_1 + \log n]$, redefine the event $\mathcal{E}_s$ as



1 



3 



(«) 



n • e 



. P -( l °g n Y b 



where e 6 := s t /17 + a, and let £ := A* 1 4!° gn ^- Combining (ICTj) with flUED , 



Pr [£] ^ 1 — log re • e" 



-(logn) £ i/ 17 +° 



2 log re • e" 



-(log n) £ t/S 



log re 



1 



1 - e 



-(logn) 



Then applying Lemma 4.11 with $\epsilon_b = \epsilon_t/17 + \sigma$ and (4.14) above gives

1 



E 



< 1 



36 • (logre) £t+9e <i 



(4.17) 



as long as $\Phi^{(t_1)} \geq n \cdot e^{-(\log n)^{\epsilon_t/18+\sigma}}$. As we did from (4.14), we conclude from (4.17) with
$\gamma := (\log n)^{1-\epsilon_t}$ that



E 



${tl+T0) I £ 



1 



< exp ( • (logre) 1 - 26 ' -9 ^ ) • re • ((logre) £d + l) 8 + 7 • re • exp ( -(log n) £t > is ~ n 



1 



^ — • re • exp ( — (log re 



,l-2e t -10e. 



1 



) H — • re • exp ( — (log re) 



\et/19+(T 
Solving the equation

$$1 - 2\epsilon_t - 10\epsilon_d = \epsilon_t/19 + \sigma$$

gives $\epsilon_t = (1 - 10\epsilon_d - \sigma) \cdot \frac{19}{39}$, for which



E 



< n • exp (-(logn) 1 -^- 10 ^-' 7 ) !- 10 ^ 
= n • exp f-(logn) 1 - 10e ' i -i ( 1 - 10ed - CT ) 



Using Markov's inequality, 



Pr 



$ (*i+logn) ^ n . exp (_( logn )l-ll e[i -§-(l-ll £[ i- CT A | £ 



«S e 



-(logri) n « 



Since $t_1 + \log n = t + \tau_{\mathrm{cont}}(n, n^{-3}) + \log n = t + O\left(\frac{\log n}{1-\lambda}\right)$, taking the union bound over (4.12)
and the inequality above yields the second statement. ■



4.3 Proof of Theorem 4.3

We start with an outline of the proof of Theorem 4.3 (see also the illustration in Figure 4).
Some techniques used here are similar to the ones used in proving Theorem 4.2. For instance,
we also use a potential function and short phases of (poly-)logarithmic length to show that
the potential decreases. However, to prove Theorem 4.3, we have to eliminate all tokens above
a constant threshold, while in Theorem 4.2 it is enough to make this number smaller than
$n \cdot e^{-(\log n)^{1-\epsilon}}$, which is still polynomial in $n$. To achieve this improvement, we switch to a
token-based viewpoint and exploit the sparseness of the load vector by using the approach from
Section 3.

First, fix a node $u$ with $x_u^{(t)} \geq 2$ and consider the next $\beta$ rounds. For the proof outline,
we confine ourselves to the case where the degree of $G$ is small (the case with larger degree is
actually slightly simpler). Clearly, the number of nodes from which a token could meet with
a token in $B_\beta(u)$, i.e., the set of nodes with distance at most $\beta$ to $u$, is at most $d^{2\beta}$. Now if
$\beta$ is small enough, then a relatively straightforward calculation shows that the total number
of tokens on nodes in $B_\beta(u)$ is at most $16(\log n)^{\epsilon}$ (Lemma 4.13). Moreover, if the nodes that
host these $16(\log n)^{\epsilon}$ tokens expand in the graph induced by the random matchings, then we
obtain a good upper bound on the probability that any pair of these $16(\log n)^{\epsilon}$ tokens meet in
round $t + \beta$. This implies that the load along the canonical path starting from $u$ in round $t$ will
decrease within the next $\beta$ rounds.

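Formally, we use an exponential potential function defined by

$$\Lambda^{(t)} := \sum_{u \in V} \Lambda_u^{(t)},$$

where

$$\Lambda_u^{(t)} := \begin{cases} e^{\frac{1}{8} (\log n)^{1-\epsilon} \cdot x_u^{(t)}} & \text{if } x_u^{(t)} \geq 2, \\ 0 & \text{otherwise,} \end{cases}$$

and $\epsilon > 0$ is a sufficiently small constant that will be fixed later. We shall prove that $\Lambda$ drops by
a factor of $e^{-\Omega(\beta \cdot (1-\lambda))}$ within $\beta$ rounds. Since initially, $\Lambda$ is polynomial in $n$ with high probability
(Lemma 4.12), it follows that after $O\left(\frac{\log n}{1-\lambda}\right)$ rounds, the value of $\Lambda$ becomes zero with high
probability and hence the maximum load is at most one.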
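
As a small illustration of the potential defined above, the following sketch evaluates it for a concrete load vector. It assumes the form $e^{c \cdot (\log n)^{1-\epsilon} \cdot x_u}$ with $c = 1/8$ as read from the definition; the function names and the choice $\epsilon = 0.1$ are hypothetical and only serve the example.

```python
import math


def node_potential(load, n, eps):
    """Contribution of one node: exp((1/8) * (log n)^(1-eps) * load) if load >= 2, else 0."""
    if load < 2:
        return 0.0
    return math.exp(0.125 * math.log(n) ** (1.0 - eps) * load)


def potential(loads, eps=0.1):
    """Sum of the node contributions over the whole load vector."""
    n = len(loads)
    return sum(node_potential(x, n, eps) for x in loads)


# A vector with maximum load one has potential exactly zero,
# while a single stack of three tokens already contributes more than one.
balanced = [1, 0, 1, 1, 0, 0, 1, 0]
stacked = [3, 0, 0, 1, 0, 0, 0, 0]
print(potential(balanced), potential(stacked))
```

The sketch makes the final step of the outline concrete: every node with at least two tokens contributes more than one, so a potential below one forces the potential to be zero and the maximum load to be at most one.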

We first derive some basic properties of this potential function. After that, we turn to the 
more involved task of establishing an expected drop of the potential. 

Lemma 4.12. Let $x^{(0)}$ be an arbitrary, non-negative load vector. Then the following holds:

• If $t \geq \tau_{\mathrm{cont}}(n, n^{-2})$ and $\|x^{(0)}\|_1 \leq n \cdot e^{-(\log n)^{1-\epsilon}}$, then with probability at least $1 - 2n^{-1}$,
$\Lambda^{(t)} \leq 9n^2$.

• If two nodes $u$ and $v$ are matched in round $t$ and $x_u^{(t-1)} - x_v^{(t-1)} \geq 2$, then
$$\Lambda_u^{(t-1)} + \Lambda_v^{(t-1)} - \Lambda_u^{(t)} - \Lambda_v^{(t)} \geq \Lambda_u^{(t-1)} \cdot \left( 1 - 2 e^{-\frac{1}{8} (\log n)^{1-\epsilon}} \right).$$

• For any $u \in V$, the function $\Lambda_u^{(t)}$ is convex in $x_u^{(t)}$, and hence $\Lambda^{(t)}$ is non-increasing in $t$.

Proof. We start with the proof of the first statement. Since $t \geq \tau_{\mathrm{cont}}(n, n^{-2})$, the time-interval
$[0, t]$ is $(n, n^{-2})$-smoothing with probability at least $1 - n^{-1}$. Fix now all the matchings in $[0, t]$
and consider the orientations of the matching edges in the time-interval $[0,t]$, which, together
with $x^{(0)}$, determine the load vector $x^{(t)}$. Fix any node $u \in V$ and consider $x_u^{(t)}$. By Lemma 3.1
and Lemma B.5,

$$\mathbf{E}[x_u^{(t)}] \leq 2 \cdot e^{-(\log n)^{1-\epsilon}} < 1.$$



By Lemma 13.31 we have for any 5 > 9, 



Pr 



xW^(l + «5)E 



,(*) 



Choosing 5 = p/E 



Hence, 



„(*) 



Pr 



for any real number p 1 yields 

p/2 



4° > 2 P 



E 



< E 



< 2 p/2 ■ e -a°g™) 1 " e -p/2. 



k=2 

oo 

< V Pr x$ > fc 



. e f(logn) 1 - e .fe 



fc=2 

oo 



< Ve" fc/8 < - < 



where the third inequality holds for sufficiently large $n$. Therefore, $\mathbf{E}[\Lambda^{(t)}] = \sum_{u \in V} \mathbf{E}[\Lambda_u^{(t)}] \leq 9n$
and by Markov's inequality,

$$\Pr[\Lambda^{(t)} \geq 9n^2] \leq n^{-1}.$$



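This completes the first statement. For the second statement,

$$\Lambda_u^{(t-1)} + \Lambda_v^{(t-1)} - \Lambda_u^{(t)} - \Lambda_v^{(t)} \geq \Lambda_u^{(t-1)} \cdot \left( 1 - \frac{\Lambda_u^{(t)} + \Lambda_v^{(t)} - \Lambda_v^{(t-1)}}{\Lambda_u^{(t-1)}} \right)$$
$$\geq \Lambda_u^{(t-1)} \cdot \left( 1 - \frac{\Lambda_u^{(t)} + \Lambda_v^{(t)}}{\Lambda_u^{(t-1)}} \right)$$
$$\geq \Lambda_u^{(t-1)} \cdot \left( 1 - \frac{2 e^{\frac{1}{8} (\log n)^{1-\epsilon} \cdot (x_u^{(t-1)} - 1)}}{e^{\frac{1}{8} (\log n)^{1-\epsilon} \cdot x_u^{(t-1)}}} \right)$$
$$= \Lambda_u^{(t-1)} \cdot \left( 1 - 2 e^{-\frac{1}{8} (\log n)^{1-\epsilon}} \right),$$

where in the third inequality we used the fact that $x_u^{(t)} \leq x_u^{(t-1)} - 1$ and $x_v^{(t)} \leq x_u^{(t-1)} - 1$.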



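For the third statement, it suffices to prove that $\Lambda_u^{(t)}$ is convex in $x_u^{(t)}$, which holds since
$x \mapsto e^{\frac{1}{8} (\log n)^{1-\epsilon} \cdot x}$ is convex and $2 \cdot e^{\frac{1}{8} (\log n)^{1-\epsilon}} \leq e^{\frac{1}{8} (\log n)^{1-\epsilon} \cdot 2}$, which holds for sufficiently large $n$.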



For the remainder of the proof, define $t_0 := \tau_{\mathrm{cont}}(n, n^{-2}) = O\left(\frac{\log n}{1-\lambda}\right)$. Roughly speaking,
these $t_0$ rounds are used to ensure a coarse balancing of the sparse initial load vector. More
precisely, the next lemma proves that every small-sized set $S$ contains only a polylogarithmic
number of tokens under the condition that the initial load vector is sparse.



Lemma 4.13. Consider the random matching model. Fix an arbitrary, non-negative load vector
$x^{(0)}$ with $\|x^{(0)}\|_1 \leq n \cdot e^{-(\log n)^{1-\epsilon}}$, where $\epsilon > 0$ is a sufficiently small constant. Assume that the
time-interval $[0,t]$ is $(n, n^{-2})$-smoothing. Then, for any subset of nodes $S \subseteq V$ with $|S| \leq$
.i-Ciogn) 1 -* if hdds that 



4 • ea' 



Pr 



J24 ] > 16-(logn) £ 

ues 



< n 



Proof. Let us consider the total number of tokens located in S in round t. Define 



£ 1 

i=l 



Since the time-interval $[0, t]$ is $(n, n^{-2})$-smoothing, every token is located at a fixed node in $S$
with probability at most $2/n$ (cf. Lemma B.5). Hence we can bound the expectation of $Z$:



E\Z] < 



~(o) 



2|5| 



< e 



-(log") 1 



e 2 



5 (log") 1 



< 1. 



77 



By Lemma 13.31 we have 

Pr[Z > (l + d)-E[Z]} < 

Hence choosing 5 



(1 + *) 



1+6 



: 1 s) 



15-(logn) E 

Effr § lves 



Pr[Z^ 16-(logn) £ ] ^Pr[Z^ (1 + 6) -E[Z]' 



< e 2 



15 • (log n) 



n 



where the last inequality holds for large enough n. This completes the proof of the lemma. ■ 

After these preparations, we are now able to analyze the drop of the potential function $\Lambda$.
The following results essentially show that after every $O\left(\frac{1}{1-\lambda}\right)$ rounds, the potential $\Lambda$ drops
exponentially. First we consider the case where the graph is sparse, i.e., the degree satisfies
$d \leq e^{(\log n)^{1/2}}$, and after that we consider the dense case where $d > e^{(\log n)^{1/2}}$.



Analysis for Sparse Graphs (Random Matching Model). Define for any node $u \in V$
and integer $r$, $B_r(u) := \{v \in V : \mathrm{dist}(u, v) \leq r\}$. For any round $t \in \mathbb{N}$, define the event $\mathcal{E}_t$ as

$$\mathcal{E}_t := \bigwedge_{u \in V} \Big( \sum_{v \in B_r(u)} x_v^{(t)} \leq 16 \cdot (\log n)^{\epsilon} \Big),$$

where $r := (\log n)^{1/3}$. Note that if $\mathcal{E}_t$ happens, then the total number of tokens located at the
nodes in each $B_r(u)$ is small.
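
The event above can be checked directly on a given graph and load vector. The following is a minimal sketch, assuming the graph is given as adjacency lists and that the radius $r = (\log n)^{1/3}$ is rounded down to an integer (the rounding and the function names are assumptions made for the example).

```python
import math
from collections import deque


def ball(adj, u, r):
    """Set of nodes within distance r of u, found by breadth-first search."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        v = queue.popleft()
        if dist[v] == r:
            continue  # do not expand beyond radius r
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return set(dist)


def sparse_event_holds(adj, loads, eps):
    """True if every ball B_r(u) holds at most 16 * (log n)^eps tokens."""
    n = len(adj)
    r = int(math.log(n) ** (1.0 / 3.0))  # assumption: radius rounded down
    threshold = 16 * math.log(n) ** eps
    return all(
        sum(loads[v] for v in ball(adj, u, r)) <= threshold for u in range(n)
    )


# Cycle on 16 nodes: one token per node passes, a stack of 20 tokens fails.
n = 16
cycle = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
uniform = [1] * n
heavy = [0] * n
heavy[5] = 20
print(sparse_event_holds(cycle, uniform, 0.1), sparse_event_holds(cycle, heavy, 0.1))
```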
Figure 4: The above diagram illustrates the proof of Lemma 4.15, the key step in proving Theorem 4.3. We show that the
load along the canonical path starting from $u$ decreases by at least one within the time-interval $[t + 1, t + \beta]$ by analyzing
all pairs of tokens $i, j$ within $B_{2\beta}(u)$ and showing that none of them is located on the same node $w$ at the end of round $t + \beta$.
Note that tokens $k$ with $w_k^{(t)} \notin B_{2\beta}(u)$ cannot intersect with the canonical path from $u$ within $[t + 1, t + \beta]$.



Lemma 4.14. Consider the random matching model and let $G$ be any $d$-regular graph with
$d \leq e^{(\log n)^{1/2}}$. Let $x^{(0)}$ be a non-negative load vector with $\|x^{(0)}\|_1 \leq n \cdot e^{-(\log n)^{1-\epsilon}}$, where
$\epsilon > 0$ is a sufficiently small constant. Then,

$$\Pr\Big[ \bigwedge_{t=t_0}^{t_0+n} \mathcal{E}_t \Big] \geq 1 - 2n^{-1}.$$

Proof. Since $t_0 = \tau_{\mathrm{cont}}(n, n^{-2})$ by definition, the time-interval $[0, t_0]$ is $(n, n^{-2})$-smoothing with
probability at least $1 - n^{-1}$. For the rest of the proof, assume that this happens. Clearly, for
any round $t \geq t_0$, the time-interval $[0,t]$ is also $(n, n^{-2})$-smoothing. Since for every $u \in V$ and
$\epsilon > 0$ small enough,



(logn)!/3 



(«) 



^(logn) 1 / 3 



^ e 



,(logn) 



)l/ 2 > i ( 1 °^) 1/3 =A(logn ) 5 / 6<4 . e l. ( l ogn) 



we obtain by Lemma 14.131 and the union bound over all n nodes that Pr [£t] ^ ra~ 6 . By the 
union bound over the time-interval [to, to + n] we have Pr [At^t™^*] ^ which yields the 
claim of the lemma. ■ 

Next we lower bound the potential drop of $\Lambda$ for load vectors $x^{(t)}$ satisfying $\mathcal{E}_t$.

Lemma 4.15. Consider the random matching model and let $G$ be any $d$-regular graph with
$d \leq e^{(\log n)^{1/2}}$. Assume that $x^{(t)}$ is any non-negative load vector that satisfies $\mathcal{E}_t$. Then the
following two statements hold.

• If $\frac{1}{1-\lambda} \leq (\log n)^{1/4}$, then
$$\mathbf{E}[\Lambda^{(t+\beta)}] \leq e^{-\Omega((\log n)^{2\epsilon})} \cdot \Lambda^{(t)},$$
where $\beta := \frac{(\log n)^{2\epsilon}}{1-\lambda}$ for a sufficiently small constant $\epsilon > 0$.

• If $\frac{1}{1-\lambda} > (\log n)^{1/4}$, then
$$\mathbf{E}[\Lambda^{(t+\beta)}] \leq (\log n)^{-\epsilon} \cdot \Lambda^{(t)},$$
where $\beta := (\log n)^{48\epsilon}$ for a sufficiently small constant $\epsilon > 0$.
Proof. In both cases we choose $\epsilon$ small enough so that $2 \cdot \beta \leq r = (\log n)^{1/3}$.

Case 1: $\frac{1}{1-\lambda} \leq (\log n)^{1/4}$. Recall that $\beta = \frac{(\log n)^{2\epsilon}}{1-\lambda}$. Fix any node $u$ with $x_u^{(t)} \geq 2$. Our
goal is to prove that the stack of $x_u^{(t)}$ tokens at node $u \in V$ disappears with high probability
after $\beta$ rounds. To this end, we simply consider all tokens in the set $B_{2\beta}(u)$ and bound the
probability that any pair of these tokens collide in round $t + \beta$, i.e., share the same location in
round $t + \beta$. See Figure 4 for an illustration.



By Corollary 12.41 it holds for any node v that 



Pr 



M [t+l,t+P] 



1 

< - + e" 
2 n 



-e((logn) 2£ ) 



> 1 



-6((logn) 2e ) 



Hence for any e < 1/4 we have \ + & -^^ n ? E ) = e-e(( lo g«) 2e ) and 



Pr 



< e 



-6((logn) 2e ) 



> 1 



-0((logn) 2E ) 



Let us now define the following event: 



< -e((logn) 2£ ) 



i.e., the event $A_u$ happens if for all nodes in $B_{2\beta}(u)$ that contain a token in round $t$ the
neighborhood within the graph induced by the matchings in the interval $[t + 1, t + \beta]$ "expands".
Since the load vector $x^{(t)}$ satisfies $\mathcal{E}_t$ by the precondition of the lemma, the total number of
tokens in $B_{2\beta}(u)$ is upper bounded by $16 \cdot (\log n)^{\epsilon}$. Consequently, the number of nodes with at
least one token is also at most $16 \cdot (\log n)^{\epsilon}$. Therefore,



Pr[AJ > l-16-(logn) e -e" 



-e((logn) 2£ ) 



1 



-e((logn) 2£ ) 



(4.18) 



Consider any pair of tokens $i, j \in T$ with $w_i^{(t)} \in B_{2\beta}(u)$ and $w_j^{(t)} \in B_{2\beta}(u)$. We upper bound
the probability that tokens $i$ and $j$ meet in round $t + \beta$ conditioned on $A_u$ as follows.



.(*) 



Pr ■ I A 



^Pr 

^ maxPr 



_ „.,(*+# 



(t+/3) 

= w 



.(*+« 



it; A Til)" ' ^' = w | A 



(*+/?) 



^maxe- « lo ^) 2£ )-l=e 



A] ]TPr 

e((logn) 2s ) 



io | A 
w | At 



(4.19) 



where the first inequality follows by Lemma 3.2 and the last inequality follows from the definition
of $A_u$. Recall that since $x^{(t)}$ satisfies $\mathcal{E}_t$, there are at most $16 \cdot (\log n)^{\epsilon}$ tokens in $B_{2\beta}(u)$. Hence



Pr 



V 

ieT: w^eBaplu) 

jeT-. wfeB 2p {u) 



w 



(t+P) 



w 



(t+P) 



sC Pr 



V (« 



(t+P) 



(t+p) 



ieT-. w^eB 20 {u) 
jeT-. wfeB 2p (u) 

^ 256(logn) 2e 



A u 



+ Pt[^Au] 



max 

(<) 



Pr 



ieT: wl l 'eB 20 (u) 

jeT ■. wfeB 2/3 (u) 



(t+P) A (t+P) 
w) = w A w) 



w I A u 



+ e" 



-0((logn) 2 -) 



< 256(logn 



,2b . e -e((logn) 2 -) + e -e((logn) 2 -) 



-0((logn) 2e ) 



where the last inequality follows from (4.19). Note that in case there are no tokens in $B_{2\beta}(u)$
which share the same node in round $t + \beta$, then if we follow the canonical path which starts
from $u$ in round $t$, there is at least one round $t' \in [t + 1, t + \beta]$ in which the load on the canonical
path of $u$ from round $t$ is at least $x_u^{(t)}$ and is reduced by one in round $t' + 1$. Since two canonical
paths which meet in a certain round $t'$ cannot both reduce their value, we obtain by the second
statement of Lemma 4.12 that



E 



\(t+P) _ \(t) 



> E ^ 

> E A ^-(i 



gl(logn) 1 - 
p -0((logr i .) 2£ ) 



-e((logn) 2 ^) 



-ecaogn.) 2 -)^ , A (t) 



where the second inequality holds for $\epsilon < 1/3$. Rearranging gives

$$\mathbf{E}[\Lambda^{(t+\beta)}] \leq e^{-\Theta((\log n)^{2\epsilon})} \cdot \Lambda^{(t)},$$

which finishes the first case.

Case 2: $\frac{1}{1-\lambda} > (\log n)^{1/4}$. Now we proceed similarly as in the first case, but here we have
$\beta = (\log n)^{48\epsilon}$, where $\epsilon > 0$ is a sufficiently small constant. By Lemma B.4 it holds that for
any node $u \in V$:



Pr 



We redefine A u as follows: 



M 



[t+l,t+p] 



^ (logn) 



-8e 



> 1 



A u := /\ { r 

veB 2 p{u) 



(t) 



[t+i,t+p] 



-(logn) 3 



^ (logn) 



-4e 



As $x^{(t)}$ satisfies $\mathcal{E}_t$ by the precondition of the lemma, the set $B_{2\beta}(u)$ contains at most $16 \cdot (\log n)^{\epsilon}$
tokens in total in round $t$ and this event is independent of the random choices for the matchings
in the rounds after $t$. Similar to (4.18) we have



Pr [Au] > 1 - 16 • (logn) £ • e" (losn)3 



1 



-6((logn) 32e ) 
As in the first case, we conclude that



Pr 



V 

ieT: W^&B^u) 

jeT-. wf eB^iu) 



IV 



(*+/3) 



w 



(t+p) 



sC Pr 



V 

£T: wf^B^u) 
,.('), 



(*+/3) 



1/' 



(*+/?) 



+ Pr[-A; 



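$$\leq 256 \cdot (\log n)^{2\epsilon} \cdot (\log n)^{-4\epsilon} + e^{-\Theta((\log n)^{32\epsilon})} = o\left((\log n)^{-\epsilon}\right),$$
and as in the first case we conclude that $\mathbf{E}[\Lambda^{(t+\beta)}] \leq (\log n)^{-\epsilon} \cdot \Lambda^{(t)}$.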



Analysis for Dense Graphs (Random Matching Model). We now consider the dense
case where the degree of the graph satisfies $d \geq e^{(\log n)^{1/2}}$. This case is easier than the sparse
case, as in this case the average load around every node is smaller than 1/2 for all nodes and all
rounds with high probability. Hence most of the neighbors of each node have zero tokens, which
implies that as long as the maximum load is larger than 1, there is an expected exponential
drop of the potential function within a single round.

To formalize this, let us define for any round $t \in \mathbb{N}$ the following event:

$$\mathcal{F}_t := \bigwedge_{u \in V} \Big( \sum_{v \in N(u)} x_v^{(t)} \leq \frac{d}{2} \Big).$$

Similar to Lemma 4.14 we now prove the following:

Lemma 4.16. Consider the random matching model and let $G$ be any $d$-regular graph with
$d > e^{(\log n)^{1/2}}$. Let $x^{(0)}$ be a non-negative load vector with $\|x^{(0)}\|_1 \leq n \cdot e^{-(\log n)^{1-\epsilon}}$, where
$\epsilon > 0$ is a sufficiently small constant. Then,

$$\Pr\Big[ \bigwedge_{t=t_0}^{t_0+n} \mathcal{F}_t \Big] \geq 1 - 2n^{-1}.$$

Proof. Since $t_0 = \tau_{\mathrm{cont}}(n, n^{-2})$ by definition, the time-interval $[0,t_0]$ is $(n, n^{-2})$-smoothing with
probability at least $1 - n^{-1}$. For the rest of the proof, assume that this happens. Clearly, this
also implies that for any round $t \geq t_0$, the time-interval $[0, t]$ is also $(n, n^{-2})$-smoothing. Let us
first lower bound $\Pr[\mathcal{F}_t]$ for any fixed $t \geq t_0$. Consider any round $t \geq t_0$ and fix a node $u \in V$.
Let

$$Z := \sum_{v \in N(u)} x_v^{(t)}.$$

Since $[0, t]$ is $(n, n^{-2})$-smoothing, every token is located at a fixed node in $N(u)$ with probability
at most $2/n$ (cf. Lemma B.5). Therefore,

E[Z] = \N(u)\ 

By Lemma 13.31 

Pr [Z > (1 + 5)B[Z}] < 



zWlL.f <; 2d- e-^™) 1 
re 



+ 



\E[Z] 
) <*-<SE[Z]/2 
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for any 5 > 9. Choosing 5 = c£/(4E [Z]) yields, 

Pr[Z>d/2] ^5- 5 ^ z V 2 -. 
By the union bound, 



d 



4E[Z[ 



< n 



-0,(1) 



Pr 



to+n 

A* 

t=to 



1 — n 



and recalling that with probability at least $1 - n^{-1}$, all the intervals $[0, t]$, $t \geq t_0$, are $(n, n^{-2})$-smoothing,
completes the proof. ■

The next lemma is similar to Lemma 4.15 from the sparse graph case.

Lemma 4.17. Consider the random matching model and let $G$ be any $d$-regular graph with
$d \geq e^{(\log n)^{1/2}}$. Assume that $x^{(t)}$ is any non-negative load vector that satisfies $\mathcal{F}_t$. Then there is
a constant $\gamma \in (0, 1)$ such that $\mathbf{E}[\Lambda^{(t+1)}] \leq (1 - \gamma) \cdot \Lambda^{(t)}$.



Proof. Note that 
E 



A (t) _ A (*+D > £ ^ Pr [ {n, ^} G 

u£V: v£N(u): 

<c£ J >2 4*) =0 



i-(logn) 1 e -x { u 



i.(logn)l 



C8 



ueV: veN(u): 

Since $x^{(t)}$ satisfies $\mathcal{F}_t$, we know that for every node $u \in V$, at least half of the neighbors $v \in N(u)$
satisfy $x_v^{(t)} = 0$. Therefore,



E 



re) 1 



„(*) 



• aw 



for a constant $\gamma \in (0, 1)$. Hence $\mathbf{E}[\Lambda^{(t+1)}] \leq (1 - \gamma) \cdot \Lambda^{(t)}$.

Finally, we are able to prove Theorem 4.3.



Proof of Theorem 4.3. We give the proof for the random matching model at first (where $d$
denotes the degree of the graph) and consider the balancing circuit model at the end of the proof
(where $d$ represents the number of matchings which are applied periodically). Corresponding
to Lemma 4.15 and Lemma 4.17, we divide our proof into four different cases.

Case 1: $d \leq e^{(\log n)^{1/2}}$ and $\frac{1}{1-\lambda} \leq (\log n)^{1/4}$. By the first statement of Lemma 4.12 it
holds with probability $1 - 2n^{-1}$ that $\Lambda^{(t_0)} \leq 9n^2$. Let $\mathcal{E} := \bigwedge_{t=t_0}^{t_0+n} \mathcal{E}_t$, where $t_0 := \tau_{\mathrm{cont}}(n, n^{-2})$.
By Lemma 4.14,

$$\Pr[\mathcal{E}] \geq 1 - 2n^{-1}.$$

Moreover, by Lemma 4.15 with $\beta = \frac{(\log n)^{2\epsilon}}{1-\lambda}$, we have

$$\mathbf{E}[\Lambda^{(t+\beta)} \mid \mathcal{E}_t] \leq e^{-\Omega((\log n)^{2\epsilon})} \cdot \Lambda^{(t)},$$

where $\epsilon > 0$ is a sufficiently small constant.
As in (4.14), we obtain that

$$\mathbf{E}[\Lambda^{(t+\beta)} \mid \mathcal{E}] \leq e^{-\Omega((\log n)^{2\epsilon})} \cdot \Lambda^{(t)}.$$
Iterating this inequality $\tau = C \cdot \frac{\log n}{(\log n)^{2\epsilon}}$ times starting from round $t_0$, and using $\Lambda^{(t_0)} \leq 9n^2$
gives



E 



^(to+r-jS) | £• < e -r-n((logn) 2£ ) , A (t ) < 



if $C > 0$ is a sufficiently large constant. Since $\tau \cdot \beta = O\left(\frac{\log n}{1-\lambda}\right)$, Markov's inequality implies

$$\Pr[\Lambda^{(t_0+\tau \cdot \beta)} \geq 1 \mid \mathcal{E}] \leq \mathbf{E}[\Lambda^{(t_0+\tau \cdot \beta)} \mid \mathcal{E}].$$

Since $\Lambda^{(t_0+\tau \cdot \beta)} < 1$ implies $\Lambda^{(t_0+\tau \cdot \beta)} = 0$, which is equivalent to a maximum load of at most one, the
proof of Case 1 is complete.

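Case 2: $d \leq e^{(\log n)^{1/2}}$ and $\frac{1}{1-\lambda} > (\log n)^{1/4}$. The proof of this case is very similar. By the
second statement of Lemma 4.15 we have

$$\mathbf{E}[\Lambda^{(t+\beta)} \mid \mathcal{E}_t] \leq (\log n)^{-\epsilon} \cdot \Lambda^{(t)},$$

where $\beta = (\log n)^{48\epsilon}$ and $\epsilon > 0$ is a sufficiently small constant. As in the analysis of (4.14), we
obtain that

$$\mathbf{E}[\Lambda^{(t+\beta)} \mid \mathcal{E}] \leq 2 \cdot (\log n)^{-\epsilon} \cdot \Lambda^{(t)}.$$

Iterating this inequality $\tau := \log n$ times and using $\Lambda^{(t_0)} \leq 9n^2$ gives

$$\mathbf{E}[\Lambda^{(t_0+\tau \cdot \beta)} \mid \mathcal{E}] \leq \left( 2 \cdot (\log n)^{-\epsilon} \right)^{\log n} \cdot \Lambda^{(t_0)} \leq n^{-\omega(1)}.$$

Note that $\tau \cdot \beta = (\log n)^{48\epsilon} \cdot \log n$. Since $\frac{1}{1-\lambda} > (\log n)^{1/4}$, $\tau \cdot \beta = O\left(\frac{\log n}{1-\lambda}\right)$ if $\epsilon < \frac{1}{192}$. The rest
of the proof is exactly the same as the proof in Case 1.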

Case 3: $d > e^{(\log n)^{1/2}}$. The proof of this case is the same as Case 1 and 2. In fact, it is
even slightly simpler, because Lemma 4.17 implies an exponential drop of the potential $\Lambda$ in a
single round.

Case 4: Balancing Circuit Model. Finally, we consider the balancing circuit model with
a sequence of $d = O(1)$ matchings. Since $d$ is a constant, it holds for every node $u \in V$ that at
most $d$ neighbors of $u$ in $G$ appear in the matching matrices $\mathbf{M}^{(1)}, \ldots, \mathbf{M}^{(d)}$. For this reason,
we assume w.l.o.g. that the underlying graph $G$ has bounded maximum degree. Moreover, we
can apply Corollary B.2 and Lemma B.3 to conclude that for all nodes $u \in V$ and any $\beta \in \mathbb{N}$,

$$\left\| \mathbf{M}^{\beta}_{u,\cdot} \right\|_2^2 \leq \frac{1}{n} + \min\left\{ O(\beta^{-1/2}), \lambda(\mathbf{M})^{2\beta} \right\}. \qquad (4.20)$$

Therefore, the same statements as in Lemma 4.13, Lemma 4.14 and Lemma 4.15 hold for the
balancing circuit model with the only difference that every round in the random matching model
corresponds to $d$ consecutive rounds in the balancing circuit model. Consequently, the analysis
of the balancing circuit model is the same as the analysis of the random matching model (Case
1 and Case 2). ■



5 The Diffusion Model 

In the diffusion model, the (continuous) load vector $\xi^{(t)}$ in round $t \geq 1$ is given by the
recursion $\xi^{(t)} = \xi^{(t-1)} P$, where for any $\gamma \geq 1$ the diffusion matrix $P = P(\gamma)$ of graph $G$ is
defined as follows: For $u, v \in V$, $P_{u,v} = \frac{1}{\gamma \Delta}$ if $\{u, v\} \in E$, $P_{u,v} = 1 - \frac{d(u)}{\gamma \Delta}$ if $u = v$, and $P_{u,v} = 0$
otherwise. Hence,

$$\xi_u^{(t)} = \xi_u^{(t-1)} + \sum_{v : \{u,v\} \in E} \frac{\xi_v^{(t-1)} - \xi_u^{(t-1)}}{\gamma \Delta}.$$

Common choices are $\gamma = 2$ (resulting in a loop probability at least 1/2) or $\gamma = 1 + 1/\Delta$, both
ensuring convergence also on bipartite graphs. Let $\lambda_1(P) \geq \ldots \geq \lambda_n(P)$ be the eigenvalues of
$P$ and $\lambda(P) := \max\{|\lambda_2(P)|, |\lambda_n(P)|\}$. Note that $\lambda(P)$ depends on $\gamma$. Let $P^t$ be the $t$-th power
of $P$ and $P^0$ be the $n$ by $n$ identity matrix.
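
The definition of the diffusion matrix translates directly into code. The following sketch builds $P(\gamma)$ from adjacency lists and iterates the continuous recursion; it uses plain Python lists, and the function names as well as the 4-cycle example are assumptions made for illustration.

```python
def diffusion_matrix(adj, gamma):
    """Build P(gamma): 1/(gamma*Delta) on edges, 1 - d(u)/(gamma*Delta) on the diagonal."""
    n = len(adj)
    max_deg = max(len(neigh) for neigh in adj)
    P = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            P[u][v] = 1.0 / (gamma * max_deg)
        P[u][u] = 1.0 - len(adj[u]) / (gamma * max_deg)
    return P


def diffusion_round(xi, P):
    """One continuous round: xi^(t) = xi^(t-1) P (row vector times matrix)."""
    n = len(xi)
    return [sum(xi[u] * P[u][w] for u in range(n)) for w in range(n)]


# 4-cycle (bipartite) with gamma = 2: the load still converges to the average.
cycle4 = [[1, 3], [0, 2], [1, 3], [0, 2]]
P = diffusion_matrix(cycle4, gamma=2.0)
xi = [8.0, 0.0, 0.0, 0.0]
for _ in range(200):
    xi = diffusion_round(xi, P)
print(xi)
```

With $\gamma = 2$ every node keeps a loop probability of at least 1/2, which is why the bipartite 4-cycle does not oscillate in this example.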

As for the matching model, there is a natural upper bound on the convergence in terms of 
the spectral gap of P. 



Theorem 5.1 ([35, Thm. 1]). Let $G$ be any graph and consider the continuous case. Then for



any e > 0, the discrepancy is at most e after x-'\{p) 
with discrepancy at most K . 



• log( ^ n ) rounds for any initial load vector 



5.1 The Discrete Case 



We study two natural protocols in the discrete case of the diffusion model. One is the vertex-based
protocol [8], where excess tokens are allocated by vertices. The second protocol we study
is the edge-based protocol [18], where every edge performs an independent randomized rounding.

The vertex-based protocol from [8] works for $d$-regular graphs as follows. In round $t$, every
node $u$ sends first $\left\lfloor \frac{x_u^{(t-1)}}{d+1} \right\rfloor$ tokens to each neighbor and keeps the same amount of tokens for
itself. After that, the remaining $x_u^{(t-1)} - (d+1) \cdot \left\lfloor \frac{x_u^{(t-1)}}{d+1} \right\rfloor$ tokens at node $u$ are randomly
distributed (without replacement) among node $u$ and its $d$ neighbors. This corresponds to a
diffusion matrix $P$ with $\gamma = 1 + 1/d$.
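
One round of the vertex-based protocol described above can be sketched as follows. This is an illustrative simulation only, assuming a $d$-regular graph given as adjacency lists and reading "without replacement" as: each of the $d + 1$ possible recipients receives at most one of the excess tokens. The function name and the example graph are hypothetical.

```python
import random


def vertex_based_round(loads, adj, rng):
    """One round on a d-regular graph: equal shares first, then the excess at random."""
    d = len(adj[0])
    new_loads = [0] * len(loads)
    for u, x in enumerate(loads):
        base, excess = divmod(x, d + 1)  # floor(x / (d+1)) and the remainder
        targets = [u] + list(adj[u])     # node u itself and its d neighbors
        for v in targets:
            new_loads[v] += base
        # excess <= d tokens go to distinct recipients chosen uniformly at random
        for v in rng.sample(targets, excess):
            new_loads[v] += 1
    return new_loads


# Cycle on 6 nodes (d = 2), all 17 tokens start on node 0.
n = 6
cycle6 = [[(i - 1) % n, (i + 1) % n] for i in range(n)]
rng = random.Random(0)
loads = [17, 0, 0, 0, 0, 0]
for _ in range(50):
    loads = vertex_based_round(loads, cycle6, rng)
print(loads, sum(loads))
```

The total number of tokens is preserved in every round, since each node hands out exactly the tokens it holds.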

Consider now the edge-based protocol [18], where the load sent along each edge is obtained
by randomly rounding the flow that would be sent in the continuous case to a nearest integer.
Let $x^{(0)} = \xi^{(0)} \in \mathbb{Z}^n$. As in the matching model, we can derive the following expression for the
deviation between the discrete and continuous model at some node $w$ in round $t$:

$$x_w^{(t)} - \xi_w^{(t)} = \sum_{s=1}^{t} \sum_{u \in V} e_u^{(s)} P^{t-s}_{u,w} = \sum_{s=1}^{t} \sum_{u \in V} \sum_{v : \{u,v\} \in E} e_{u,v}^{(s)} P^{t-s}_{u,w} = \sum_{s=1}^{t} \sum_{[u:v] \in E} e_{u,v}^{(s)} \left( P^{t-s}_{u,w} - P^{t-s}_{v,w} \right), \qquad (5.1)$$



where $e_{u,v}^{(s)}$ is the rounding error for each edge $[u : v] \in E$ in round $s$ defined by



■(<,_!) Ja-XY 
7A 

Js-1) _ Js-X) 
Sf St* 

7A 



Js-X) _ Js-X) 
7A 

Js-X) _As-l) 
7A 



w. p. 



w. p. 



Js-X) _ Js-X) 
7A 



Js-X) _ Js-X) 
C,v C,u 



7A 



Js-X) _ Js-X) 
Sf Sit 

7A 

Js-X) _ Js-X) 

7A 



Moreover, if the flow that would be sent in the continuous case is an integer, then $e_{u,v}^{(s)} = 0$. By definition, we have
$\mathbf{E}[e_{u,v}^{(s)}] = 0$.



Further, for any set of different (not necessarily disjoint) edges, their rounding errors within the 
same round are mutually independent. 
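
The randomized rounding used by the edge-based protocol can be illustrated with a short sketch. It rounds a fractional flow value up or down so that the expectation is preserved; the helper name `round_flow` is hypothetical, and the sign convention of the error (rounded value minus exact value) is an assumption made for the example.

```python
import math
import random


def round_flow(y, rng):
    """Round y to floor(y) or ceil(y) at random such that the expected result equals y."""
    lo = math.floor(y)
    frac = y - lo
    if frac == 0:
        return lo  # integral flows are sent exactly, so the error is zero
    # round up with probability equal to the fractional part
    return lo + 1 if rng.random() < frac else lo


# Empirical check: the average rounding error over many independent draws is near zero.
rng = random.Random(1)
y = 2.3
samples = [round_flow(y, rng) for _ in range(20000)]
errors = [s - y for s in samples]
print(sum(errors) / len(errors))
```

Each edge would call such a routine with its own independent randomness, which matches the independence of the rounding errors within one round stated above.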



5.2 Local Divergence and Discrepancy 

Based on the deviation between the discrete and continuous case (5.1), we now define the
(refined) local $p$-divergence in the diffusion model.

Definition 5.2 (Local $p$-Divergence for Diffusion [8, 17, 35]). For any graph $G$ and $p \in \mathbb{Z}^{+}$ the
local $p$-divergence is

$$\Psi_p(P) = \max_{w \in V} \left( \sum_{t=0}^{\infty} \sum_{[u:v] \in E} \left| P^t_{w,u} - P^t_{w,v} \right|^p \right)^{1/p}$$

and the refined local $p$-divergence is

$$\Upsilon_p(P) = \max_{w \in V} \left( \frac{1}{2} \sum_{t=0}^{\infty} \sum_{u \in V} \max_{v \in N(u)} \left| P^t_{w,u} - P^t_{w,v} \right|^p \right)^{1/p}.$$

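Clearly, $\Upsilon_p(P) \leq \Psi_p(P)$. We now present our bounds on the (refined) local 2-divergence.

Theorem 5.3. For any graph $G$ and any $\gamma > 1$ not necessarily constant, it holds that

$$\Upsilon_2(P) \leq \Psi_2(P) \leq \sqrt{\frac{\gamma \cdot \Delta}{2 - 2/\gamma}}.$$

Moreover, for any $\gamma > 0$, we have $\Psi_2(P) \geq \sqrt{\Delta}$ and

$$\Upsilon_2(P) \geq \sqrt{\frac{1 + \Delta}{2}}.$$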



The upper bound on $\Psi_2(P)$ is minimized for $\gamma = 2$ and becomes in that case $\sqrt{2 \cdot \Delta}$. This
result significantly improves over the previous bounds in [8], which all depend on the spectral
gap $1 - \lambda(P)$ or are restricted to special networks. The analysis of the edge-based algorithm in
[18] did not use $\Psi_2(P)$ or $\Upsilon_2(P)$, but their bound on the discrepancy also includes the spectral
gap.
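
The bounds of Theorem 5.3 can be checked numerically on a small graph. The sketch below is an illustration only: it truncates the infinite sum in the local 2-divergence after a finite number of rounds, evaluates it for a single start node $w$ (which suffices on a vertex-transitive graph such as a cycle), and assumes the bounds $\sqrt{\Delta} \leq \Psi_2(P) \leq \sqrt{\gamma \Delta / (2 - 2/\gamma)}$. All function names are hypothetical.

```python
import math


def cycle_edges(n):
    """Edges of the cycle on n nodes, each listed once."""
    return [(i, (i + 1) % n) for i in range(n)]


def truncated_psi2(n, edges, gamma, w, rounds):
    """sqrt of sum_{t=0}^{rounds} sum_{[u:v] in E} (P^t[w][u] - P^t[w][v])^2, plus max degree."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    max_deg = max(deg)
    step = 1.0 / (gamma * max_deg)
    p = [0.0] * n
    p[w] = 1.0  # row w of P^0, the identity matrix
    total = 0.0
    for _ in range(rounds + 1):
        total += sum((p[u] - p[v]) ** 2 for u, v in edges)
        # advance to row w of the next power: p <- p P
        nxt = [p[u] * (1.0 - deg[u] * step) for u in range(n)]
        for u, v in edges:
            nxt[v] += p[u] * step
            nxt[u] += p[v] * step
        p = nxt
    return math.sqrt(total), max_deg


gamma = 2.0
psi, delta = truncated_psi2(8, cycle_edges(8), gamma, w=0, rounds=500)
upper = math.sqrt(gamma * delta / (2.0 - 2.0 / gamma))
print(math.sqrt(delta), psi, upper)
```

On the 8-cycle with $\gamma = 2$ the computed value lies strictly between the lower bound $\sqrt{2}$ and the upper bound $2$.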

Proof of Theorem 5.3. Our proof uses a similar approach based on the same potential function
as in [7, Lemma 1]. However, we have to perform a more precise analysis to handle the case
where $\gamma$ is very close to 1, i.e. $\gamma = 1 + 1/d$, which corresponds to the vertex-based protocol.
By contrast, the proof in [7, Lemma 1] is based on a sequential exposure of the edges and only
works if $\gamma \geq 4$.

Fix any node $w \in V$. Define the potential function in round $t$ by

$$\Phi^{(t)} := \sum_{u \in V} \left( P^t_{w,u} - \frac{1}{n} \right)^2.$$

Clearly, $\Phi^{(0)} = 1 \cdot \left(1 - \frac{1}{n}\right)^2 + (n - 1) \cdot \left(\frac{1}{n}\right)^2 = 1 - \frac{1}{n}$.

We first prove the upper bound on $\Psi_2(P)$. Let $y_u := P^{t-1}_{w,u}$. By the definition of the diffusion
model, we have

2 



Vu + 




Vu 




d{u) 
7A 



v&N(u) 



Vu 



A \Vv 

7 A V n> 



n 



+ 



d(u) 
7A 
Using the notation $\mathbf{E}_{v \in N(u)}[X(v)]$ for the expectation of a random variable $X(v)$ where $v \in N(u)$
is chosen uniformly at random, we can rewrite the above expression and upper bound it using
Jensen's inequality:

*^E(^«[0-^)-( 9 «-i) + ^-(-i)]) 2 



7A 

d(u) 
7A 



Vu 



7A 

d(u) 
nJ ' 7 A 



+ 



2/t> 



n 



d(«) / 1 



7A 



n 



eJ^-(0-^M^)+^-^)) : 

"spy- ((-^)-(*-:)^- (*-=))'■ 



Note that 



ueV [u:v]£E 

and using the upper bound on $\Phi^{(t)}$ from above, we obtain
$(*-!) _ $(*) 



[u:v]eE 

- E < 



1 



d(u) 



+ 



d(v) 



d{u) 
7A 

d(v 



E 

[u:v]eE U v y 



7A 



1\2 1 



n/ ' 7 A 



+ 



n 



n 



d(u) 



d(u) 
7A 



■ [Vu 

n 



d{u) 
7A 



Vv 

n 



+ ^-(^-^) 2 -dR-(( 1 -^)-(^-^) + ^A 1 -(^-^), 
where the last equality follows from $p^2 - q^2 = (p + q) \cdot (p - q)$. Further,
1 // d(u)\ ( 1\ d{u) 



A = 



d{u) 



1 / 2 d(u) , A d(u) . 

2y« T-yyu -yv)] — t- • (y u - Vv) 



d(u) \ n 7A J 7A 

1 / \ / 2 \ 



7A V /V n 7A 

Similarly, we get 



1 / \ / 2 \ 



and thus 



A + B = — -[y u -y v y [2y u - - - — ■ [y u - y v ) - 2y v + - + — • (y v - y u ) j 



1 / \ / \ / d(u) + d(v) 



7 A \ y w J \& a civ J y 7A 

1 / \ 2 2A 

7AT-( y ""^) 2 -( 2 "^ 



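Therefore,

$$\Phi^{(t-1)} - \Phi^{(t)} \geq \frac{2 - 2/\gamma}{\gamma \Delta} \cdot \sum_{[u:v] \in E} (y_u - y_v)^2,$$

i.e.,

$$\sum_{[u:v] \in E} \left( P^{t-1}_{w,u} - P^{t-1}_{w,v} \right)^2 \leq \frac{\gamma \Delta}{2 - 2/\gamma} \cdot \left( \Phi^{(t-1)} - \Phi^{(t)} \right).$$

Finally, summing over all rounds gives

$$\sum_{t=1}^{\infty} \sum_{[u:v] \in E} \left( P^{t-1}_{w,u} - P^{t-1}_{w,v} \right)^2 \leq \frac{\gamma \Delta}{2 - 2/\gamma} \cdot \sum_{t=1}^{\infty} \left( \Phi^{(t-1)} - \Phi^{(t)} \right) \leq \frac{\gamma \Delta}{2 - 2/\gamma} \cdot \Phi^{(0)} \leq \frac{\gamma \Delta}{2 - 2/\gamma},$$

and $\Psi_2(P) \leq \sqrt{\frac{\gamma \Delta}{2 - 2/\gamma}}$. For the lower bound on $\Psi_2(P)$, we consider a node $w \in V$ with $d(w) = \Delta$
to obtain that

$$\Psi_2(P) \geq \sqrt{ \sum_{[u:v] \in E} \left( P^0_{w,u} - P^0_{w,v} \right)^2 } \geq \sqrt{ \Delta \cdot (1 - 0)^2 } = \sqrt{\Delta}.$$

In the same way, we prove the lower bound on $\Upsilon_2(P)$:

$$\Upsilon_2(P) \geq \sqrt{ \frac{1}{2} \sum_{u \in V} \max_{v \in N(u)} \left( P^0_{w,u} - P^0_{w,v} \right)^2 } = \sqrt{ (\Delta + 1) \cdot \frac{1}{2} }.$$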

Lemma 5.4. Consider the edge-based diffusion model. Fix two rounds $t_1 < t_2$ and the load
vector $x^{(t_1)}$ at the end of round $t_1$. For any family of non-negative numbers $g_{u,v}^{(s)}$ ($[u : v] \in E$,
$t_1 + 1 \leq s \leq t_2$), define the random variable $Z := \sum_{s=t_1+1}^{t_2} \sum_{[u:v] \in E} g_{u,v}^{(s)} \cdot e_{u,v}^{(s)}$. Then $\mathbf{E}[Z] = 0$
and for any $\delta > 0$ it holds that

$$\Pr[|Z - \mathbf{E}[Z]| \geq \delta] \leq 2 \exp\left( - \frac{\delta^2}{8 \sum_{s=t_1+1}^{t_2} \sum_{[u:v] \in E} \left( g_{u,v}^{(s)} \right)^2} \right).$$

The proof of Lemma 5.4 is the same as Lemma 2.12 except that the inner sum runs over
all edges of the graph and we have to use the slightly weaker inequality $|e_{u,v}^{(s)}| < 1$ instead of
$|e_{u,v}^{(s)}| \leq 1/2$, which results in an extra factor of 4 in the denominator.

We now use the above machinery to derive upper bounds on the discrepancy for the edge- 
based and vertex-based protocol. 

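Theorem 5.5. Consider the edge-based protocol on an arbitrary graph $G$ where $\gamma > 1$ is any
constant. Then:

• For any round $t$,
$$\Pr\left[ \max_{w \in V} \left| x_w^{(t)} - \xi_w^{(t)} \right| = O\left( \sqrt{\Delta \log n} \right) \right] \geq 1 - 2n^{-1}.$$

• After $O\left( \frac{\log(Kn)}{1 - \lambda(P)} \right)$ rounds, the discrepancy is at most $O(\sqrt{\Delta \log n})$ w.p. at least $1 - 2n^{-1}$.

Moreover, consider the vertex-based protocol on a $d$-regular graph $G$. Then:

• For any round $t$,
$$\Pr\left[ \max_{w \in V} \left| x_w^{(t)} - \xi_w^{(t)} \right| = O\left( d \sqrt{\log n} \right) \right] \geq 1 - 2n^{-1}.$$

• After $O\left( \frac{\log(Kn)}{1 - \lambda(P)} \right)$ rounds, the discrepancy is at most $O(d \sqrt{\log n})$ w.p. at least $1 - 2n^{-1}$.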

Proof. We prove this result in the same way as Theorem 2.14, but now we invoke Lemma 5.4
instead of Lemma 2.12. Fix any node $w \in V$, round $t$ and define $Z_w := x_w^{(t)} - \xi_w^{(t)}$. By (5.1) and the symmetry of $P$,

$$Z_w = x_w^{(t)} - \xi_w^{(t)} = \sum_{s=1}^{t} \sum_{[u:v] \in E} \left( P^{t-s}_{w,u} - P^{t-s}_{w,v} \right) \cdot e_{u,v}^{(s)}.$$

Applying Lemma 5.4 we have $\mathbf{E}[Z_w] = 0$ and for any $\delta > 0$ that

$$\Pr[|Z_w| \geq \delta] \leq 2 \exp\left( - \frac{\delta^2}{8 \sum_{s=1}^{t} \sum_{[u:v] \in E} \left( P^{t-s}_{w,u} - P^{t-s}_{w,v} \right)^2} \right).$$

By the definition of the local 2-divergence, the denominator above is upper bounded by
$8 \Psi_2(P)^2$, and we obtain for $\delta = 4 \sqrt{\log n} \cdot \Psi_2(P)$ that

$$\Pr\left[ |Z_w| \leq 4 \sqrt{\log n} \cdot \Psi_2(P) \right] \geq 1 - 2n^{-2},$$

and the first statement follows by using the union bound and the upper bound on $\Psi_2(P)$ from
Theorem 5.3. The second statement follows directly by applying Theorem 5.1.
For the vertex-based algorithm, it was shown in [8, Proof of Thm. 1.1] that



Pr 



max 

weV 



-(*) _ f (t) 



0(v%gn-d-r 2 (P) 



Using Theorem 5.3 with $\gamma = 1 + 1/d$ (and $\Delta = d$) gives $\Upsilon_2(P) \leq \Psi_2(P) = O(d)$, which yields the
third statement. Finally, the bound on the discrepancy follows immediately from Theorem 5.1.
A Concentration Inequalities

The following concentration inequality is also known as "Method of Averaged Bounded Differences".

Theorem A.1 ([13, page 83]). Let $Y_1, \ldots, Y_n$ be an arbitrary set of random variables and let $f$
be a function of these random variables satisfying the property that for each $\ell \in \{1, \ldots, n\}$, there
is a non-negative $c_\ell$ such that

$$\left| \mathbf{E}[f \mid Y_\ell, Y_{\ell-1}, \ldots, Y_1] - \mathbf{E}[f \mid Y_{\ell-1}, \ldots, Y_1] \right| \leq c_\ell.$$

Then for any $\delta > 0$,

$$\Pr[|f - \mathbf{E}[f]| > \delta] \leq 2 \exp\left( - \frac{\delta^2}{2 \sum_{\ell=1}^{n} c_\ell^2} \right).$$

The following result is borrowed from [13]. It can be shown easily using the Taylor expansion
of $\mathbf{E}[e^{\lambda X}]$.



Lemma A. 2 ([13|, Problem 1.14]). Let Xy, X2, ■ ■ ■ , X n be independent 0/1-random variables 
with Pr[Xi = 1] = pi. Let X := Ya=i and p, = E[X] = Y17=iPi- V f or a ^ subsets 
SC{l,...,n}, Pr[/\ ieS (Xi = 1)] < Y[ ieS Pr[Xi = 1] , then it holds for all 5 > that 



Pr[X ^ (l + 5)p] s$ 



1+5 



The following lemma is a standard Chernoff bound for the sum of independent, identically 
distributed geometric random variables. 

Lemma A.3. Consider some fixed $0 < p < 1$. Suppose that $X_1, \ldots, X_n$ are independent geometric random variables on $\mathbb{N}$ with $\Pr[X_i = k] = (1-p)^{k-1} p$ for every $k \in \mathbb{N}$. Let $X = \sum_{i=1}^{n} X_i$ and $\mu = \mathbf{E}[X]$. Then it holds for all $\beta > 0$ that

$$\Pr[X \geq (1+\beta)\mu] \leq \exp\left(-\frac{\beta^2 n}{2(1+\beta)}\right).$$
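The bound of Lemma A.3, read as $\exp(-\beta^2 n / (2(1+\beta)))$, can be compared with a simulation. The sketch below is only an illustration under assumed parameters ($n = 50$, $p = 1/2$, $\beta = 1/2$, a fixed seed); the helper names are hypothetical and the sampler uses the standard inverse-transform method for geometric variables on $\{1, 2, \ldots\}$.

```python
import math
import random


def geometric_sum_tail_bound(n: int, beta: float) -> float:
    """Bound exp(-beta^2 * n / (2 * (1 + beta))) on Pr[X >= (1 + beta) * mu]."""
    return math.exp(-(beta ** 2) * n / (2.0 * (1.0 + beta)))


def sample_geometric(rng: random.Random, p: float) -> int:
    """Sample k >= 1 with Pr[k] = (1 - p)^(k - 1) * p via inverse transform."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - p)) + 1


def empirical_tail(n: int, p: float, beta: float, trials: int, seed: int) -> float:
    """Fraction of trials in which the sum of n geometrics is >= (1 + beta) * mu."""
    rng = random.Random(seed)
    mu = n / p  # expectation of the sum
    hits = 0
    for _ in range(trials):
        total = sum(sample_geometric(rng, p) for _ in range(n))
        if total >= (1.0 + beta) * mu:
            hits += 1
    return hits / trials


# Illustrative parameters (assumptions for this example only).
empirical = empirical_tail(n=50, p=0.5, beta=0.5, trials=2000, seed=0)
bound = geometric_sum_tail_bound(n=50, beta=0.5)
print(f"empirical tail = {empirical:.4f}, bound = {bound:.4f}")
```

In this regime the empirical frequency is expected to lie well below the bound, which reflects that the lemma is stated for convenience rather than tightness.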

We continue to define the notion of negative regression. 



Definition A.4 ([14, Definition 21]). A random vector $X = (X_1, \ldots, X_n) \in \{0,1\}^n$ is said to satisfy the negative regression condition if for any two disjoint subsets $I$ and $J$ of $\{1, \ldots, n\}$ and any non-decreasing function $f : \{0,1\}^{|I|} \to \mathbb{R}$,

$$\mathbf{E}[f(X_i, i \in I) \mid X_j = \sigma_j, j \in J]$$

is non-increasing in each $\sigma_j \in \{0,1\}$, $j \in J$.

Lemma A.5 ([14, Lemma 26]). Let $X = (X_1, \ldots, X_n) \in \{0,1\}^n$ be a random vector that satisfies the negative regression condition. Then for any index set $I \subseteq \{1, \ldots, n\}$ and any non-decreasing functions $f_i$, $i \in I$,

$$\mathbf{E}\Big[\prod_{i \in I} f_i(X_i)\Big] \leq \prod_{i \in I} \mathbf{E}[f_i(X_i)].$$
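The product inequality $\mathbf{E}[\prod_i f_i(X_i)] \leq \prod_i \mathbf{E}[f_i(X_i)]$ can be verified exactly on a small example. The sketch below assumes, for illustration only, a vector with exactly one coordinate equal to 1 chosen uniformly at random (a standard example of negative dependence) and the non-decreasing functions $f_i(x) = 1 + i \cdot x$; neither choice comes from the analysis itself.

```python
from fractions import Fraction


def one_hot_outcomes(n):
    """All outcomes of a 0/1 vector with exactly one 1; each has probability 1/n."""
    return [tuple(1 if i == k else 0 for i in range(n)) for k in range(n)]


def expectation_of_product(outcomes, funcs):
    """E[prod_i f_i(X_i)] under the uniform distribution on the outcomes."""
    prob = Fraction(1, len(outcomes))
    total = Fraction(0)
    for x in outcomes:
        term = Fraction(1)
        for f, xi in zip(funcs, x):
            term *= f(xi)
        total += prob * term
    return total


def product_of_expectations(outcomes, funcs):
    """prod_i E[f_i(X_i)] under the uniform distribution on the outcomes."""
    prob = Fraction(1, len(outcomes))
    result = Fraction(1)
    for i, f in enumerate(funcs):
        result *= sum(prob * f(x[i]) for x in outcomes)
    return result


n = 4
# f_i(x) = 1 + i * x for i = 1..n; each is non-decreasing on {0, 1}.
funcs = [lambda x, c=i + 1: Fraction(1 + c * x) for i in range(n)]
outcomes = one_hot_outcomes(n)

lhs = expectation_of_product(outcomes, funcs)
rhs = product_of_expectations(outcomes, funcs)
print(lhs, rhs)
```

Exact rational arithmetic is used so that the comparison is not affected by floating-point error.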
B Useful Inequalities for Markov Chains



Lemma B.1 ([27, Lemma 3.4 and Remark 4]). Let $Q$ be the transition matrix of a reversible, ergodic Markov chain with stationary distribution $\pi$. Let $a := \inf_x \{Q_{x,x} : Q_{x,x} > 0\}$ and $c := \inf\{\pi(x) Q_{x,y} : x \neq y \text{ and } Q_{x,y} > 0\} > 0$. Then for any two states $x, y$, and any round $t \geq 0$,

$$Q^t_{x,y} \leq \pi(y) + \frac{\pi(y)}{a c \sqrt{t+1}}.$$
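To see how the quantities $a$ and $c$ of Lemma B.1 are computed in a concrete case, the following Python sketch evaluates them for the lazy random walk on a cycle and checks the transition probabilities against a bound of the form $\pi(y) + \pi(y)/(a c \sqrt{t+1})$. Both the choice of chain and this reading of the bound are assumptions made for the illustration, and all function names are hypothetical.

```python
import math


def lazy_cycle_walk(n):
    """Transition matrix of the lazy random walk on a cycle with n >= 3 nodes."""
    Q = [[0.0] * n for _ in range(n)]
    for x in range(n):
        Q[x][x] = 0.5
        Q[x][(x + 1) % n] = 0.25
        Q[x][(x - 1) % n] = 0.25
    return Q


def step(dist, Q):
    """One step of the chain: row vector dist multiplied by Q."""
    n = len(dist)
    return [sum(dist[x] * Q[x][y] for x in range(n)) for y in range(n)]


def min_slack(n, rounds):
    """Smallest value of (bound - max_y Q^t[0][y]) over t = 0..rounds."""
    Q = lazy_cycle_walk(n)
    pi = 1.0 / n  # uniform stationary distribution
    a = min(Q[x][x] for x in range(n) if Q[x][x] > 0)
    c = min(
        pi * Q[x][y]
        for x in range(n)
        for y in range(n)
        if x != y and Q[x][y] > 0
    )
    dist = [0.0] * n
    dist[0] = 1.0  # row 0 of Q^0; by symmetry of the cycle one row suffices
    slack = math.inf
    for t in range(rounds + 1):
        bound = pi + pi / (a * c * math.sqrt(t + 1))
        slack = min(slack, bound - max(dist))
        dist = step(dist, Q)  # dist becomes row 0 of Q^(t+1)
    return slack


slack = min_slack(n=16, rounds=200)
print(f"minimum slack over all rounds: {slack:.4f}")
```

Here $a = 1/2$ and $c = 1/(4n)$, so the additive term equals $8/\sqrt{t+1}$; the bound is loose for small $t$ and only becomes informative once $t$ is large compared with $1/(ac)^2$.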



We now apply Lemma B.1 to the balancing circuit model where the number of matchings $d$ is a constant. Recall that $M = \prod_{i=1}^{d} M^{(i)}$ and $M^t$ is the $t$-th power of the matrix $M$ (which is different from $M^{(t)}$).

Corollary B.2. Consider the balancing circuit model with $d = O(1)$ matchings $M^{(1)}, \ldots, M^{(d)}$ such that $M = \prod_{i=1}^{d} M^{(i)}$ corresponds to an ergodic Markov chain. Then for any two nodes $u, v \in V$ and any round $t \geq 0$ it holds that

$$M^t_{u,v} \leq \frac{1}{n} + O\left(\frac{1}{\sqrt{t+1}}\right).$$

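Proof. Note that $M$ is a symmetric matrix which corresponds to a reversible, ergodic Markov chain with uniform stationary distribution. Further, in the notation of Lemma B.1, $a \geq 2^{-d}$ and $c \geq (1/n)\,2^{-d}$. Hence applying Lemma B.1 to the round matrix $M$ implies

$$M^t_{u,v} \leq \frac{1}{n} + \frac{1/n}{2^{-d}\cdot(1/n)\,2^{-d}\cdot\sqrt{t+1}} = \frac{1}{n} + \frac{2^{2d}}{\sqrt{t+1}},$$

and the additive term is $O(1/\sqrt{t+1})$ since $d = O(1)$.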



The next lemma is a well-known fact in Markov chain theory. 



Lemma B.3 ([32, Lemma 1]). Consider the balancing circuit model for $d$ matchings $M^{(1)}, \ldots, M^{(d)}$. Let $M = \prod_{i=1}^{d} M^{(i)}$. Then, for any $u \in V$ it holds that



< A 



2/ 



The following lemma is the corresponding result of Corollary B.2 for the random matching model.

Lemma B.4. Let $G$ be any $d$-regular graph and consider the random matching model. Fix any node $u \in V$. Then, for any $\sigma$ with $\frac{1}{2}\sigma^{-1} \geq \frac{1}{n}$, there is a constant $c$ independent of $\sigma$ and $n$ such that



Pr 



M 



-i 



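Proof. Consider a round $s$ for which $\big\|M^{[1,s]}_{u,\cdot}\big\|_2^2 \geq \sigma^{-1}$. Since $\|z\|_2^2 \leq \|z\|_\infty \cdot \|z\|_1$ holds for any vector $z$, we have $\big\|M^{[1,s]}_{u,\cdot}\big\|_\infty \geq \sigma^{-1}$. Let $v$ be any node with $M^{[1,s]}_{u,v} \geq \sigma^{-1}$. We continue with a case distinction on the degree of $G$.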
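Case 1: $d \geq 4\sigma^2$. Since $\sum_{w \in N(v)} M^{[1,s]}_{u,w} \leq 1$, at least $d/2$ of the neighbors $w \in N(v)$ satisfy $M^{[1,s]}_{u,w} \leq \frac{2}{d} \leq \frac{1}{2}\sigma^{-2}$. Hence an edge $\{v,w\}$ with $M^{[1,s]}_{u,w} \leq \frac{1}{2}\sigma^{-2}$ is included in the random matching in round $s+1$ with constant probability. Assuming that this event happens, then

$$\big\|M^{[1,s+1]}_{u,\cdot}\big\|_2^2 = \sum_{k \in V,\, k \notin \{v,w\}} \big(M^{[1,s]}_{u,k}\big)^2 + 2\left(\frac{M^{[1,s]}_{u,v} + M^{[1,s]}_{u,w}}{2}\right)^2 = \big\|M^{[1,s]}_{u,\cdot}\big\|_2^2 - \frac{1}{2}\big(M^{[1,s]}_{u,v} - M^{[1,s]}_{u,w}\big)^2 \qquad \text{(B.1)}$$

and therefore

$$\big\|M^{[1,s+1]}_{u,\cdot}\big\|_2^2 \leq \big\|M^{[1,s]}_{u,\cdot}\big\|_2^2 - \frac{1}{2}\Big(\sigma^{-1} - \frac{1}{2}\sigma^{-2}\Big)^2.$$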



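Case 2: $d \leq 4\sigma^2$. Since $\frac{1}{2}\sigma^{-1} \geq \frac{1}{n}$ by the assumption on $\sigma$, there is at least one node $w$ with $M^{[1,s]}_{u,w} \leq \frac{1}{2}\sigma^{-1}$. Consider now a shortest path $P = (u_1 = v, \ldots, u_\ell = w)$ from $v$ to such a node $w$ with the property that $w$ is the first node on $P$ with $M^{[1,s]}_{u,w} \leq \frac{1}{2}\sigma^{-1}$. By construction of $P$, the length $\ell$ satisfies $\ell - 1 \leq 1/(\frac{1}{2}\sigma^{-1}) = 2\sigma$. Hence, there must be at least one edge $\{f,g\} \in E$ along the path $P$ so that

$$\big| M^{[1,s]}_{u,f} - M^{[1,s]}_{u,g} \big| \geq \frac{\frac{1}{2}\sigma^{-1}}{2\sigma} = \frac{1}{4}\sigma^{-2}.$$

Note that the edge $\{f,g\}$ is included in the matching in round $s+1$ with probability $\Omega(1/d) = \Omega(\sigma^{-2})$. If the edge $\{f,g\}$ is part of the matching, then we conclude from (B.1) that

$$\big\|M^{[1,s+1]}_{u,\cdot}\big\|_2^2 \leq \big\|M^{[1,s]}_{u,\cdot}\big\|_2^2 - \frac{1}{32}\sigma^{-4}.$$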

Summarizing both cases, we can upper bound the minimum round $\tau$ before $\big\|M^{[1,\tau]}_{u,\cdot}\big\|_2^2 \leq \sigma^{-1}$ occurs by the sum of $32\sigma^4$ independent geometric random variables with success probability $\Omega(d^{-1}) = \Omega(\sigma^{-2})$ each. Using Lemma A.3 we obtain that the sum of these geometric variables is larger than $c \cdot \sigma^6$ with probability at most $e^{-\sigma}$, if $c$ is a sufficiently large constant. ■
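The potential argument in the proof of Lemma B.4 rests on an elementary identity: averaging two entries $a$ and $b$ of a vector lowers its squared Euclidean norm by exactly $\frac{1}{2}(a-b)^2$. The short Python sketch below checks this on a small example; the helper names and the example vector are assumptions chosen for illustration only.

```python
def apply_matching_edge(row, v, w):
    """Average entries v and w of a row vector, as one matched edge {v, w} does."""
    new_row = list(row)
    average = (row[v] + row[w]) / 2.0
    new_row[v] = average
    new_row[w] = average
    return new_row


def squared_norm(row):
    """Squared Euclidean norm of a vector."""
    return sum(x * x for x in row)


# Example row of a doubly stochastic matrix (entries sum to 1).
row = [0.5, 0.1, 0.2, 0.2]
after = apply_matching_edge(row, 0, 1)

drop = squared_norm(row) - squared_norm(after)
expected = 0.5 * (row[0] - row[1]) ** 2
print(drop, expected)
```

Since every matched edge can only decrease the squared norm, the potential is non-increasing over the rounds, and each round containing an edge with a large difference of its endpoints yields a quantifiable decrease.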

Lemma B.5. Fix any sequence of matchings $\mathcal{M} = (M^{(1)}, M^{(2)}, \ldots)$ and consider the continuous process. Assume that the time-interval $[0,t]$ is $(K,\varepsilon)$-smoothing. Then for any non-negative vector $y$ with $\|y\|_1 = 1$ it holds that



^ e. Moreover, for any t ^ r cont (l,e) ; 



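Proof. Let $\xi^{(0)}$ be any load vector with initial discrepancy at most $K$. Since $[0,t]$ is $(K,\varepsilon)$-smoothing, $\mathrm{disc}(\xi^{(t)}) \leq \varepsilon$. That is, for all pairs of nodes $u, v \in V$ it holds that

$$\big|\xi_u^{(t)} - \xi_v^{(t)}\big| \leq \varepsilon.$$

Consequently, since the average load $\overline{\xi}$ lies between the minimum and the maximum load, for all nodes $w \in V$,

$$\big|\xi_w^{(t)} - \overline{\xi}\big| \leq \varepsilon.$$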
Using the above inequality along with the triangle inequality, we get



^ 1 Vw^w £ 
wev 



wev 



wev wev 

For the second statement, let $y$ be the unit-vector which is one at component $v$ and $\xi^{(0)}$ be the unit-vector which is one at component $u$. Then $\overline{\xi} = \frac{1}{n}$ and $\xi_v^{(t)} = \sum_{w \in V} \xi_w^{(0)} M^{[1,t]}_{w,v} = M^{[1,t]}_{u,v}$. Hence



V „ A*) _ Jt) _ M Li,tj i 



[M] 



M 



[M] _ 1 